FOREWORD

Among  the  responsibilities  assigned  to  the  Office  of  the  Manager,  National 
Communications  System,  is  the  management  of  the  Federal  Telecommunication 
Standards Program. Under this program, the NCS, with the assistance of the
Federal Telecommunication Standards Committee, identifies, develops, and
coordinates proposed Federal Standards that contribute either to the
interoperability of functionally similar Federal telecommunication systems or to the
achievement of a compatible and efficient interface between computer and
telecommunication systems. In developing and coordinating these standards, a
considerable  amount  of  effort  is  expended  in  initiating  and  pursuing  joint  standards 
development  efforts  with  appropriate  technical  committees  of  the  International 
Organization  for  Standardization,  and  the  International  Telegraph  and  Telephone 
Consultative  Committee  of  the  International  Telecommunication  Union.  This 
Technical  Information  Bulletin  presents  an  overview  of  an  effort  which  is 
contributing  to  the  development  of  compatible  Federal,  national,  and  international 
standards  in  the  area  of  facsimile.  It  has  been  prepared  to  inform  interested 
Federal  activities  of  the  progress  of  these  efforts.  Any  comments,  inputs  or 
statements  of  requirements  which  could  assist  in  the  advancement  of  this  work  are 
welcome  and  should  be  addressed  to: 

Office  of  the  Manager 
National Communications System

1.0 INTRODUCTION


This  document  summarizes  work  performed  by  Delta  Information  Systems, 
Inc.  (DIS)  for  the  National  Communications  System  (NCS),  Office  of  Technology 
and  Standards.  This  office  is  responsible  for  the  management  of  the  Federal 
Telecommunications Standards Program, which develops telecommunications
standards whose use is mandatory for all Federal departments and agencies. The
purpose  of  this  project,  performed  under  Task  2,  Subtask  3  of  contract  number 
DCA100-91-C-0031  during  Option  Year  3,  was  to  continue  the  work  on  color 
facsimile  that  was  begun  under  a  previous  task. 

The  digital  transmission  of  color  imagery  is  of  particular  importance  to  the 
Government  for  transmission  of  photographs,  half  tones,  maps  and  fingerprints. 
The  ITU-T  is  now  in  the  process  of  developing  standards  for  the  transmission  of 
color  imagery  as  part  of  the  facsimile  recommendations,  including  both  Group  3 
facsimile  and  Group  4  facsimile. 

The  purpose  of  this  project  was  to  continue  the  color  facsimile  work  started 
under  a  previous  task,  including  the  evaluation  of  the  use  of  default  Huffman 
tables,  optimized  ("Custom")  Huffman  tables,  default  quantization  tables,  and 
scaled  quantization  tables.  Included  in  this  effort  was  the  modification  of  existing 
Joint  Photographic  Experts  Group  (JPEG)  compression  and  decompression 
software available from the Independent JPEG Users Group to process Commission
Internationale de l'Eclairage L*a*b* (CIELAB) color images and to use externally
specified Huffman tables. In addition, a conversion program was written to convert
CIELAB  color  space  images  to  red,  green,  blue  (RGB)  color  space  images  to  allow 
viewing  of  the  images  by  a  commercially  available  viewing  package  such  as 
HIJACK. 

This report consists of five sections. Section 1.0 provides a brief
description  of  the  objectives  of  the  task  and  an  outline  of  the  contents  of  this 
report. 


Section  2.0  provides  some  background  on  the  use  of  custom  vs.  default 
quantization  matrices  and  Huffman  coding  tables  for  transmitting  color  FAX  images 
by  the  JPEG  (Joint  Photographic  Experts  Group)  baseline  standard. 

Section  3.0  is  a  summary  of  the  work  performed.  This  includes  the 
software  development  required  in  order  to  support  the  evaluation  runs  along  with 
the  results  of  the  evaluation  runs  themselves  and  the  conclusions  that  can  be 
drawn  from  them. 

Section  4.0  is  a  technical  discussion  comparing  Huffman  vs  Informational 
coding. 

Section 5.0 is a discussion of possible future plans.

2.0 BACKGROUND


2.1  The  Elements  of  JPEG 

When  one  peels  away  all  the  detailed  specifications  of  the  JPEG  system  and 
examines  its  basic  ingredients,  one  finds  the  three  essential  elements  of  a  good 
data  compression  system  for  continuous-tone  imagery:  (1)  a  data  compactor  (the 
DCT),  which  packs  most  of  a  block  of  data  into  a  few  frequency  coefficients,  (2)  a 
quantizer  (the  quantization  matrices  and  compression  scale  factor),  which  controls 
the  trade-off  between  data  compression  and  fidelity,  and  (3)  a  variable  length  or 
"entropy"  coder  (Huffman  or  arithmetic  coder)  to  encode  the  data  that  must  be 
transmitted  after  compaction  and  quantization.  Most  recent  work  has  been  aimed 
at  employing  JPEG  for  some  specific  purpose,  such  as  color  facsimile,  and  fine 
tuning  the  quantization  matrices  and  Huffman  codes.  This  report  describes  the 
work  done  by  Delta  Information  Systems  using  various  Huffman  Tables  (JPEG 
default,  optimized  etc.)  within  the  JPEG  compression  algorithm. 

2.2  Default  Huffman  Coding  Tables 

One  of  the  outstanding  issues  of  the  past  few  years  has  been  whether  there 
is  a  good  default  Huffman  code.  A  "custom"  Huffman  code,  optimized  for  the 
image  to  be  transmitted,  will  always  perform  at  least  as  well  as  any  other  Huffman 
code,  and  usually  better.  Normally  the  use  of  a  custom  code  has  two  major 
drawbacks.  One  is  that  the  transmitter  must  make  two  coding  passes:  one  to 
collect  the  image  statistics  and  build  the  codes,  and  one  to  encode  and  transmit 
the  data.  The  other  drawback  is  that  the  transmitter  must  transmit  the  custom 
coding  tables  to  the  receiver.  It  should  be  noted  that  the  current  standard  for 
Group  3  facsimile  requires  the  sending  of  the  Huffman  tables  independent  of 
whether  they  are  custom  or  not.  Because  of  this  operational  requirement,  only  the 
first  drawback  exists  for  Group  3  facsimile  implementations. 
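
The two passes can be sketched as follows. This is an illustrative Python sketch rather than the code used in the study: the names are hypothetical, the "image" is a short string of symbols, and the sketch does not enforce the 16-bit limit that baseline JPEG places on code word length.

```python
import heapq
from collections import Counter

def huffman_code_lengths(counts):
    """Return {symbol: code word length in bits} for a histogram of counts."""
    symbols = [s for s, n in counts.items() if n > 0]
    if len(symbols) == 1:
        return {symbols[0]: 1}
    # Heap entries: (weight, tie breaker, symbols contained in this subtree).
    heap = [(counts[s], i, [s]) for i, s in enumerate(symbols)]
    heapq.heapify(heap)
    lengths = dict.fromkeys(symbols, 0)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, group1 = heapq.heappop(heap)
        w2, _, group2 = heapq.heappop(heap)
        for s in group1 + group2:
            lengths[s] += 1  # every merge adds one bit to each member
        heapq.heappush(heap, (w1 + w2, tie, group1 + group2))
        tie += 1
    return lengths

def coded_bits(counts, lengths):
    """Total bits needed to code the data with the given code word lengths."""
    return sum(n * lengths[s] for s, n in counts.items())

message = "aaaaaaaabbbbccd"                # stand-in for the symbols of an image

counts = Counter(message)                  # pass 1: collect the statistics
lengths = huffman_code_lengths(counts)     # build the custom code
total = coded_bits(counts, lengths)        # pass 2: cost of encoding the data

print(lengths, total)
```

A default code skips the first pass entirely: `lengths` would be fixed in advance and only `coded_bits` would depend on the image.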

Much  of  the  effort  devoted  to  the  current  study  has  consisted  of  going  back 
to  the  basics  of  information  theory  to  show  that,  in  principle,  it  is  straightforward 
to  build  a  default  Huffman  code  (or  a  separate  code  for  each  of  the  various 
transmission  parameters  and  image  classes)  that  will  perform  as  well  as  or  better 
than  any  other  fixed  (image  independent)  Huffman  code.  One  of  the  basic  tenets 
of  information  theory  is  the  notion  of  entropy,  based  upon  the  probability  function 
of  the  symbols  being  encoded.  The  entropy  is  the  theoretical  lower  bound  on  the 
long-term  average  code  word  length  per  symbol  for  that  probability  function  when 
the  symbols  are  coded  independently.  A  Huffman  code  derived  from  the  same 
probability  function  is  guaranteed  to  produce  a  long-term  average  bit  rate  that  is  no 
more  than  one  bit  per  symbol  greater  than  the  entropy,  and  often  exceeds  the 
entropy  by  a  much  smaller  amount  than  one  bit  per  symbol.  Moreover,  this 
Huffman  code  is  optimal  in  the  sense  that  no  fixed  Huffman  code  can  yield  a 
smaller  average  number  of  bits  per  coded  symbol. 
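
As a small worked illustration of this bound (the probabilities below are chosen for convenience and are not measured from the test images), consider a four-symbol alphabet with probabilities 1/2, 1/4, 1/8 and 1/8. Its entropy is

```latex
\[
H = \tfrac{1}{2}\log_2 2 + \tfrac{1}{4}\log_2 4 + \tfrac{1}{8}\log_2 8 + \tfrac{1}{8}\log_2 8
  = 0.5 + 0.5 + 0.375 + 0.375 = 1.75 \text{ bits per symbol.}
\]
```

A Huffman code for these probabilities has code word lengths of 1, 2, 3 and 3 bits, so its average length is also 1.75 bits per symbol and meets the bound exactly. For a two-symbol alphabet with probabilities 0.9 and 0.1, by contrast,

```latex
\[
H = -\left(0.9\log_2 0.9 + 0.1\log_2 0.1\right) \approx 0.137 + 0.332 = 0.469 \text{ bits per symbol,}
\]
```

while any Huffman code must spend 1 bit per symbol. The excess of about 0.53 bit is still within the one-bit guarantee.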

The main practical impediment to generating this optimal fixed Huffman
code  is  the  requirement  that  a  very  large  number  of  symbols  be  sampled  to  yield  a 
reliable  estimate  of  the  probability  function.  Section  3  describes  experiments  in 
which  various  Huffman  codes  were  evaluated.  The  main  conclusion  drawn  from 
the  results  of  these  experiments  was  that  the  data  sample  sizes  were  insufficient 
to  yield  close  estimates  of  the  probability  function.  If  a  number  of  independent 
experimenters  were  to  collect  a  sufficient  number  of  random  samples  from  which 
to  estimate  the  probability  function,  then  the  law  of  large  numbers  says  that  the 
estimated  probability  functions,  and  hence  the  Huffman  codes,  would  be  very 
close to one another.

3.0 SUMMARY OF WORK PERFORMED


This  section  details  the  work  performed  by  Delta  Information  Systems  in 
evaluating  the  use  of  various  Huffman  tables  in  the  JPEG  compression  algorithm. 
This  work  was  divided  into  three  tasks.  The  first  task  was  the  software  effort 
required  to  support  the  evaluation  runs.  The  second  task  was  the  actual 
processing  of  the  evaluation  runs  themselves.  The  third  and  final  task  was  the 
review  of  the  results  of  the  runs  and  the  formation  of  conclusions.  The  details  of 
each  of  these  tasks  are  discussed  in  the  next  three  sections. 

3.1  Software  Modifications 

To perform the Huffman table evaluation runs, software had to be written
or modified for the following functions:

Conversion  of  the  raw  CIELAB  color  images  into  a  file  format  suitable  for 
processing. 

Modifications  to  the  JPEG  users  group  compression  and  decompression 
software  to  process  CIELAB  images  and  to  allow  the  use  of  externally 
specified  Huffman  tables. 

Software  to  convert  a  CIELAB  color  image  to  a  Targa  format  RGB  color 
image  file  for  viewing. 

3.1.1  CIELAB  Image  Conversion 

The  CIELAB  color  image  files  acquired  for  the  JPEG  Huffman  table 
evaluations  were  of  two  different  formats  and  could  not  be  processed  directly  by 
the  JPEG  users  group  software.  One  set  of  images  contained  the  three  color 
components  of  the  image  in  the  same  file,  plane  interleaved.  The  second  set  of 
images  contained  the  three  color  components  of  the  image  in  three  separate  files. 
To solve this problem, a Delta Information Systems (DIS) CIELAB color image file format
was  defined  and  conversion  programs  were  written  to  convert  the  CIELAB  color 
images  to  this  format  prior  to  processing  by  the  JPEG  software.  The  DIS  image  file 
format  consists  of  a  header  section  containing  the  image  size,  data  precision  etc. 
and  a  detail  section  containing  the  pixel  interleaved  LAB  color  components.  Shown 
in  Figure  3.1  is  the  image  file  structure. 
Header

Image id       one byte - set to an ASCII "L"
Image Type     one byte - currently unused
Image index    two bytes - currently unused
X origin       two bytes - currently set to 0 indicating left to right scan
Y origin       two bytes - currently set to 0 indicating top to bottom scan
Width          two bytes - number of pixels in a line
Length         two bytes - number of lines in the image
Pix depth      one byte - number of bits per pixel - currently 24
Image Desc     one byte - currently unused

Detail

L component - pixel one line one - one byte
A component - pixel one line one - one byte
B component - pixel one line one - one byte
L component - pixel two line one - one byte
A component - pixel two line one - one byte
B component - pixel two line one - one byte
    :
L component - pixel 'x' line one - one byte
A component - pixel 'x' line one - one byte
B component - pixel 'x' line one - one byte
    :
L component - pixel 'x' line 'y' - one byte
A component - pixel 'x' line 'y' - one byte
B component - pixel 'x' line 'y' - one byte

FIGURE 3.1 DIS CIELAB FILE FORMAT
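
The header layout in Figure 3.1 adds up to 14 bytes and can be read with a fixed-size structure. The following Python sketch is illustrative only: the report does not state the byte order of the two-byte fields, so little-endian is assumed here, and the function names are hypothetical.

```python
import struct

# Field order follows Figure 3.1. Byte order is NOT given in the report;
# little-endian ("<") is an assumption made for this sketch.
DIS_HEADER = struct.Struct("<cBHHHHHBB")

def parse_dis_header(data):
    """Decode the 14-byte DIS CIELAB header into a dictionary."""
    (image_id, image_type, image_index, x_origin, y_origin,
     width, length, pix_depth, image_desc) = DIS_HEADER.unpack_from(data, 0)
    if image_id != b"L":
        raise ValueError("not a DIS CIELAB file: image id is not 'L'")
    return {"x_origin": x_origin, "y_origin": y_origin,
            "width": width, "length": length, "pix_depth": pix_depth}

def pixel_offset(width, x, y, pix_depth=24):
    """Byte offset of the L component of pixel x on line y (both 0-based)."""
    return DIS_HEADER.size + (y * width + x) * (pix_depth // 8)

# Build a header for a hypothetical 2048 x 2048 image and read it back.
raw = DIS_HEADER.pack(b"L", 0, 0, 0, 0, 2048, 2048, 24, 0)
header = parse_dis_header(raw)
file_size = pixel_offset(header["width"], 0, header["length"])

print(header, file_size)
```

Pixel interleaving means the A and B components of a pixel sit at the next two byte offsets after its L component. For the hypothetical 2048 x 2048 image above, the computed file size is 12,582,926 bytes.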

3.1.2 JPEG Software Modifications


The  compression  and  decompression  software  as  supplied  by  the  JPEG 
users  group  could  only  process  the  following  file  formats: 

PPM - PBMPLUS color format
PGM - PBMPLUS gray-scale format
GIF
TARGA
RLE - Utah Raster Toolkit

Since  none  of  these  file  formats  was  compatible  with  our  pixel  interleaved  CIELAB 
file  format,  the  JPEG  compression  and  decompression  software  was  modified  to 
process/generate  the  DIS  CIELAB  image  file  format  discussed  in  Section  3.1.1. 

Another  requirement  of  the  JPEG  compression  software  was  the  ability  to 
process  a  color  image  using  an  externally  specified  Huffman  table  and  also  to  save 
the  Huffman  table  used  for  a  given  compression  run.  After  evaluating  the  changes 
required  to  allow  the  manipulation  of  the  Huffman  tables  within  the  JPEG  software, 
it  was  found  that  it  would  be  easier  to  manipulate  the  symbol  count  tables  used  to 
generate  the  Huffman  tables.  The  JPEG  software  was  therefore  modified  to  allow 
the  user  to  specify  an  external  symbol  count/histogram  file.  The  JPEG  software 
could  then  read  this  file  and  generate  a  Huffman  table  based  on  its  contents. 
Additionally  the  JPEG  compression  software  was  modified  to  save  the  current 
symbol  count/histogram  table  to  a  disk  file  for  subsequent  use  in  other  evaluation 
runs. 

3.1.3  CIELAB  to  RGB  Color  Space  Conversion 

To  view  the  CIELAB  color  image  files  with  the  commercially  available 
software  package  HIJACK,  software  was  written  to  convert  CIELAB  color  images 
to  RGB  color  images.  The  conversion  of  a  CIELAB  image  file  is  a  multiple  step 
process  consisting  of  the  following: 

- Descale the eight bit LAB components to original LAB values
- Convert the LAB color space components to the XYZ color space components
- Convert the XYZ color space components to the RGB color space components
- Scale the RGB values to eight bits for use by the viewing program

The  descaling  of  the  LAB  components  is  image  dependent  but  the  remainder 
of  the  conversion  is  done  according  to  the  following  equations. 

The  LAB  to  XYZ  color  space  conversion  is  done  using  the  following  reverse 
transformations  of  the  XYZ  to  LAB  color  space  conversion. 
\[
\begin{aligned}
X &= X_n \left( \frac{L + 16}{116} + \frac{a}{500} \right)^{3} \\
Y &= Y_n \left( \frac{L + 16}{116} \right)^{3} \\
Z &= Z_n \left( \frac{L + 16}{116} - \frac{b}{200} \right)^{3}
\end{aligned}
\]

where  Xn,  Yn,  Zn  are  the  Tristimulus  values  for  the  Reference  White  for  a  specific 
illuminant. 

The  XYZ  to  RGB  color  space  conversion  is  done  using  the  following  inverse 
transformations  of  the  RGB  to  XYZ  color  space  conversion. 

\[
\begin{aligned}
r &= 1.911\,X - 0.534\,Y - 0.290\,Z \\
g &= -0.985\,X + 1.999\,Y - 0.028\,Z \\
b &= 0.058\,X - 0.119\,Y + 0.902\,Z
\end{aligned}
\]

The  resultant  RGB  color  space  values  are  then  gamma  corrected  and  scaled 
to  eight  bit  integers  for  the  viewing  program. 
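
The conversion chain can be sketched in Python as follows. The coefficients are those given above. The reference white (CIE Illuminant C, normalized so that Yn = 1) and the gamma value of 2.2 are assumptions, since the report names neither, and the image-dependent descaling step is omitted. Function names are illustrative.

```python
# Reference white (Xn, Yn, Zn). Assumed: CIE Illuminant C with Yn = 1.
WHITE_C = (0.981, 1.0, 1.182)
GAMMA = 2.2  # assumed display gamma; not stated in the report

def lab_to_xyz(L, a, b, white=WHITE_C):
    """Inverse CIELAB transform using the cube-law equations of this section."""
    xn, yn, zn = white
    fy = (L + 16.0) / 116.0
    return (xn * (fy + a / 500.0) ** 3,
            yn * fy ** 3,
            zn * (fy - b / 200.0) ** 3)

def xyz_to_rgb(x, y, z):
    """Linear RGB from XYZ using the matrix given in this section."""
    r = 1.911 * x - 0.534 * y - 0.290 * z
    g = -0.985 * x + 1.999 * y - 0.028 * z
    b = 0.058 * x - 0.119 * y + 0.902 * z
    return r, g, b

def to_eight_bits(value, gamma=GAMMA):
    """Clip to [0, 1], gamma correct, and scale to an integer in 0..255."""
    clipped = min(max(value, 0.0), 1.0)
    return int(round(255.0 * clipped ** (1.0 / gamma)))

def lab_to_rgb8(L, a, b):
    return tuple(to_eight_bits(c) for c in xyz_to_rgb(*lab_to_xyz(L, a, b)))

white_pixel = lab_to_rgb8(100.0, 0.0, 0.0)   # reference white
gray_pixel = lab_to_rgb8(50.0, 0.0, 0.0)     # mid-lightness neutral
print(white_pixel, gray_pixel)
```

Clipping before gamma correction is a design choice of this sketch: LAB values outside the RGB gamut would otherwise produce negative or oversized components.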

3.2  Results  of  Evaluation  Runs 

Listed  below  are  the  results  of  the  Huffman  Table  evaluation  runs.  The 
evaluation  runs  results  are  divided  into  two  groupings.  In  the  first  group  each  LAB 
color  image  was  compressed  using  the  following  Huffman  tables: 

-  Optimized  to  the  image  itself 

-  T.81  -  JPEG  default  Huffman  table 

-  Huffman  table  from  ITU-T  Delayed  Contribution  DIO  from  Japan 

-  Delta  Composite 

The  Delta  composite  Huffman  table  was  generated  from  the  combined 
histograms  of  the  eight  test  images.  Included  in  Appendix  A  are  plots  of  the 
histograms  of  the  four  Huffman  coded  symbol  sets  used  in  the  generation  of  the 
Delta  composite  Huffman  table.  Also  included  in  Appendix  A  are  the  plots  of  the 
histograms  of  the  FAXBALLS  image  which  show  how  widely  image  histograms  can 
differ  with  little  negative  effect  on  the  data  compression. 
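
A composite histogram of this kind can be formed by summing symbol counts over the images, separately for each of the four symbol sets. The report does not say whether the counts were weighted or normalized before being combined, so the plain sum below is an assumption. The set names and the counts are illustrative, not data from the test images.

```python
from collections import Counter

SYMBOL_SETS = ("dc_luminance", "ac_luminance",
               "dc_chrominance", "ac_chrominance")

def composite_histograms(per_image):
    """Sum symbol counts over images, separately for each symbol set."""
    combined = {name: Counter() for name in SYMBOL_SETS}
    for image_hist in per_image:
        for name in SYMBOL_SETS:
            combined[name].update(image_hist.get(name, {}))
    return combined

# Made-up counts for two images (symbol -> number of occurrences).
image_a = {"dc_luminance": {0: 10, 1: 5}, "ac_luminance": {0x00: 7, 0x01: 3}}
image_b = {"dc_luminance": {1: 2, 2: 4}, "ac_luminance": {0x01: 1}}

combined = composite_histograms([image_a, image_b])
print(combined["dc_luminance"])
```

With a plain sum, large images contribute more symbols and therefore dominate the composite. Dividing each histogram by its total before summing would weight every image equally instead.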

Although it was not part of the original scope of work, the Huffman tables
from Contribution DIO from Japan were included. This was done primarily to
compare  the  results  of  their  Huffman  table  against  the  JPEG  default  and  also  the 
Delta  composite.  It  should  be  noted  that  their  "composite"  Huffman  table  was 
generated  using  two  subsamplings  and  only  three  images.  For  details  on 
Contribution  DIO  see  Appendix  B. 

In  the  second  group  of  evaluation  runs  each  image  was  compressed  using 
the  Huffman  table  generated  by  the  optimized  runs  for  each  of  the  other  LAB  color 
images. The evaluation runs were performed at scale factors of 9, 24 and 71.
The scale factor of 9 will generate a high quality image with reduced compression.
The scale factor of 24 will generate an image of reasonable quality with increased
compression. The scale factor of 71 will generate a very small file of low image
quality.  Listed  below  are  the  results  of  all  evaluation  runs.  All  image  file  sizes  are 
given  in  bytes.  Black  and  white  versions  of  the  images  processed  can  be  found  in 
Appendix  C. 

Image SA001_NL - Size = 12,582,926

Scale Factor             9          24          71

Optimized        1,413,382     730,832     319,345
T.81 - JPEG      1,437,991     745,805     358,701
Contrib DIO      1,431,632     741,669     353,769
Delta            1,456,336     737,763     339,987
PM003_NL         1,453,547     753,219     351,563
C1_LAB           1,423,506     740,357     360,143
APPOTLAB         1,509,188     748,591     340,601
TOYSLAB          1,515,136     738,269     338,825
FAXBALLS         1,516,839     761,738     322,104
LATOUR1          1,573,038     758,688     321,980
LATOUR2          1,569,391     758,875     322,104

Image PM003_NL - Size = 12,582,926

Scale Factor             9          24          71

Optimized        1,135,193     560,700     284,822
T.81 - JPEG      1,161,256     581,991     324,281
Contrib DIO      1,151,224     568,895     313,302
Delta            1,177,348     572,712     305,290
SA001_NL         1,247,329     606,375     332,819
C1_LAB           1,165,696     581,162     326,804
APPOTLAB         1,221,377     592,422     310,714
TOYSLAB          1,260,243     608,081     323,425
FAXBALLS         1,220,607     580,365     298,826
LATOUR1          1,281,776     600,438     297,786
LATOUR2          1,270,637     598,000     297,431

Image C1_LAB - Size = 15,728,658

Scale Factor             9          24          71

Optimized        2,959,458   1,647,938     834,134
T.81 - JPEG      3,030,400   1,659,528     852,469
Contrib DIO      3,050,316   1,656,829     846,693
Delta            3,138,471   1,681,105     844,083
PM003_NL         3,109,657   1,673,421     849,160
C1_LAB           3,258,119   1,700,042     849,147
APPOTLAB         3,258,500   1,730,925     857,116
TOYSLAB          3,343,333   1,745,020     858,053
FAXBALLS         3,252,919   1,739,885     860,634
LATOUR1          3,474,274   1,804,264     863,499
LATOUR2          3,452,565   1,800,748     865,525


Image APPOTLAB - Size = 24,160,886

Scale Factor             9          24          71

Optimized        2,245,465   1,066,939     396,281
T.81 - JPEG      2,293,869   1,126,514     479,767
Contrib DIO      2,275,981   1,099,993     462,139
Delta            2,291,850   1,082,191     440,976
SA001_NL         2,305,114   1,116,424     493,769
PM003_NL         2,325,993   1,106,855     438,365
C1_LAB           2,278,071   1,119,215     486,525
TOYSLAB          2,340,781   1,079,254     463,780
FAXBALLS         2,427,822   1,146,556     423,937
LATOUR1          2,414,091   1,092,128     407,649
LATOUR2          2,416,806   1,090,696     408,218

Image TOYSLAB - Size = 35,558,270

Scale Factor             9          24          71

Optimized        3,163,614   1,524,975     593,394
T.81 - JPEG      3,227,067   1,587,727     708,666
Contrib DIO      3,197,180   1,555,361     686,838
Delta            3,222,968   1,538,413     654,687
SA001_NL         3,225,164   1,553,303     710,330
PM003_NL         3,266,247   1,579,982     670,189
C1_LAB           3,189,545   1,572,865     715,820
APPOTLAB         3,280,633   1,532,801     656,083
FAXBALLS         3,400,209   1,605,095     634,853
LATOUR1          3,401,297   1,550,249     596,466
LATOUR2          3,403,397   1,549,424     596,834
Image FAXBALLS - Size = 1,572,882

Scale Factor             9          24          71

Optimized           80,897      47,554      25,653
T.81 - JPEG         83,273      50,124      30,685
Contrib DIO         82,830      49,310      29,800
Delta               85,156      49,729      28,827
SA001_NL            87,759      50,919      30,790
PM003_NL            84,533      49,864      29,284
C1_LAB              83,557      50,066      31,077
APPOTLAB            87,883      51,204      29,130
TOYSLAB             90,255      51,421      29,585
LATOUR1             90,035      49,860      26,334
LATOUR2             89,636      49,749      26,353

Image LATOUR1 - Size = 16,906,674

Scale Factor             9          24          71

Optimized        1,262,357     533,097     220,717
T.81 - JPEG      1,288,695     570,000     284,478
Contrib DIO      1,275,174     557,610     272,778
Delta            1,277,126     545,219     254,756
SA001_NL         1,296,753     568,623     291,848
PM003_NL         1,295,937     557,393     256,980
C1_LAB           1,278,480     569,794     289,699
APPOTLAB         1,299,961     545,853     254,561
TOYSLAB          1,305,164     550,099     268,690
LATOUR2          1,334,552     533,601     226,526

Image LATOUR2 - Size = 16,906,674

Scale Factor             9          24          71

Optimized        1,524,934     674,848     267,031
T.81 - JPEG      1,554,130     701,705     325,141
Contrib DIO      1,538,951     688,199     313,384
Delta            1,551,592     678,911     297,435
SA001_NL         1,586,426     697,241     329,151
PM003_NL         1,564,440     694,127     302,283
C1_LAB           1,537,945     698,744     329,408
APPOTLAB         1,608,170     682,037     297,007
TOYSLAB          1,617,216     684,058     306,985
FAXBALLS         1,636,931     706,195     289,987
LATOUR1          1,663,550     680,237     270,423

3.3  Discussion  of  Results 

In  this  discussion,  the  term  "composite  Huffman  code"  means  any  one  of 
the  following:  T.81  -  JPEG,  DIO,  or  Delta.  In  all  three  cases,  a  code  word  is 
assigned  to  every  possible  symbol.  The  term  "optimized  Huffman  code"  means  a 
Huffman  code  optimized  to  a  specific  image.  Only  symbols  that  occur  within  that 
image  are  assigned  code  words.  The  term  "other  image  Huffman  code"  means  a 
Huffman  code  for  a  specific  image,  like  the  optimized  code,  except  that  a  count  of 
1  is  assigned  to  each  possible  symbol  that  never  occurs  in  that  image.  This  allows 
other  images  to  be  coded  with  the  "other  image  code"  for  a  given  image.  Since 
Huffman  code  is  the  only  code  employed  in  this  study,  the  word  "code"  in  the 
following  discussion  implies  Huffman  code. 
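
The difference between an optimized code and an "other image" code lies only in the histogram from which the code is built. A minimal Python sketch of the padding step follows. The 12-symbol alphabet (DC categories 0 through 11) and the counts are assumptions chosen for illustration, and the function name is hypothetical.

```python
def other_image_counts(counts, alphabet):
    """Give every possible symbol a count of at least 1 so that a code built
    from the result can encode symbols the source image never produced."""
    return {s: max(counts.get(s, 0), 1) for s in alphabet}

# Assumed alphabet: DC size categories 0..11.
DC_ALPHABET = range(12)

# Made-up histogram of an image that uses only categories 0 and 3.
image_counts = {0: 5, 3: 2}

padded = other_image_counts(image_counts, DC_ALPHABET)

# The optimized code covers 2 symbols; the padded histogram covers all 12.
print(len(image_counts), len(padded))
```

The padding costs the source image a little efficiency, because code space is reserved for symbols it never uses.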

For  each  test  image  and  compression  scale  factor,  the  percentage  difference 
by  which  the  greatest  exceeded  the  least  bit  count  resulting  from  the  use  of  the 
three  composite  codes  was  noted.  The  percentage  difference  by  which  the  least 
bit  count  of  the  three  composite  codes  exceeded  the  bit  count  resulting  from  the 
optimized  code  was  also  calculated.  Finally,  the  percentage  difference  between 
the greatest and least bit counts resulting from employing the other image Huffman
codes  was  computed. 
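
As a worked instance of these three measures, the following Python sketch applies them to the SA001_NL results at a scale factor of 9, using the file sizes listed in Section 3.2. Sizes in bytes give the same percentages as bit counts. The helper name is illustrative.

```python
def percent_excess(larger, smaller):
    """Percentage by which `larger` exceeds `smaller`."""
    return 100.0 * (larger - smaller) / smaller

# Image SA001_NL, scale factor 9 (bytes, from Section 3.2).
optimized = 1413382
composite = {"T.81 - JPEG": 1437991, "Contrib DIO": 1431632, "Delta": 1456336}
other_image = {"PM003_NL": 1453547, "C1_LAB": 1423506, "APPOTLAB": 1509188,
               "TOYSLAB": 1515136, "FAXBALLS": 1516839,
               "LATOUR1": 1573038, "LATOUR2": 1569391}

best_composite = min(composite.values())

composite_spread = percent_excess(max(composite.values()), best_composite)
best_vs_optimized = percent_excess(best_composite, optimized)
other_spread = percent_excess(max(other_image.values()),
                              min(other_image.values()))

print(f"spread among composite codes:   {composite_spread:.2f}%")
print(f"best composite over optimized:  {best_vs_optimized:.2f}%")
print(f"spread among other image codes: {other_spread:.2f}%")
```

For this image and scale factor the three values are about 1.7, 1.3 and 10.5 percent respectively.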

For  a  compression  scale  factor  of  9,  DIO  was  the  best  of  the  three 
composite  codes  for  seven  of  eight  images,  with  T.81  -  JPEG  being  the  best  in  the 
other.  However,  the  difference  between  the  worst  and  the  best  never  exceeded 
3.6  percent.  Therefore,  for  this  scale  factor,  one  can  conclude  that  all  three 
composite  codes  perform  approximately  equally  well.  The  spread  between  the 
worst  and  best  performances  of  the  other  image  codes  was  considerably  greater, 
ranging from 5.5 to 11.6 percent over the 8 images being compressed. The other
image code derived from Image C1_LAB gave the best performance in 7 of 8 cases,
and  performed  approximately  as  well  as,  and  sometimes  better  than,  the  composite 
codes.  The  best  composite  code  bit  count  exceeded  that  of  the  optimized  case  by 
0.9  to  2.4  percent  over  the  eight  test  images. 

For  a  compression  scale  factor  of  24,  the  Delta  code  was  best  for  five  of  the 
test  images,  and  DIO  was  best  in  the  other  three.  The  difference  between  the 
worst  and  best  ranged  from  1.2  to  4.6  percent.  The  spread  among  the  three 
composite  codes  is  thus  greater  than  with  a  compression  scale  factor  of  9.  The 
spread  between  the  worst  and  best  other  image  codes  ranged  from  3.1  to  7.8 
percent.  TOYSLAB  and  LATOUR2  each  were  best  in  two  cases;  no  other  image 
was best in more than one case. The smallest bit count produced by the composite
codes  was  0.6  to  5.5  percent  greater  than  that  generated  by  the  optimized  code 
over  the  test  image  set. 

For  a  compression  scale  factor  of  71,  the  Delta  code  was  the  best  of  the 
three  composite  codes  in  all  eight  cases.  The  spread  between  the  best  and  worst 
composite codes ranged from 0.95 to 11.4 percent depending upon the image
being  compressed.  The  spread  between  the  worst  and  best  other  image  codes 
was  1.9  to  28.6  percent.  The  performance  of  the  best  composite  code  with 
respect  to  the  optimized  code  ranged  from  1.2  to  15.5  percent. 

Thus,  the  performance  spreads  between  composite  and  optimized  codes  and 
among  the  different  composite  codes  increased  as  the  compression  scale  factor 
increased. 

3.4  Conclusions 

The  foregoing  results  indicate  that  optimizing  the  Huffman  code  gives 
marginally  better  performance  than  a  default  code  for  low  compression  (high  image 
quality),  but  considerably  better  performance  when  the  data  compression  is  high. 

The  notion  of  entropy,  which  is  the  lower  bound  on  the  average  code  word 
length  for  independently  coded  symbols,  is  based  on  the  assumption  of  an  inherent 
probability  function  for  the  symbol  set.  Section  4.0  shows  the  relationships 
among  entropy,  information,  and  Huffman  codes.  Entropy  theory  says,  in  effect, 
that  the  optimal  default  Huffman  code  is  that  which  is  derived  from  this  probability 
function.  In  the  long  run,  this  Huffman  code  will  perform  as  well  as  or  better  than 
any  other  fixed  Huffman  code.  To  approximate  this  inherent  probability  function, 
assuming  that  one  exists,  one  should  compile,  for  each  of  DC  luminance,  AC 
luminance,  DC  chrominance,  and  AC  chrominance,  a  composite  histogram  of 
symbol  occurrences  over  a  very  large  number  of  images  with  widely  varying  image 
characteristics.  The  Huffman  codes  derived  from  these  histograms  should  then  be 
used  as  default  codes.  Of  course,  optimized  codes  will  perform,  in  general,  better 
than  these  default  codes. 

It  is  concluded  that  the  composite  codes  employed  in  these  experiments 
were  compiled  from  an  insufficient  number  of  samples,  or  there  was  an  insufficient 
mix  of  image  characteristics  that  possess  their  own  peculiar  statistics.  This  would 
account  for  the  considerable  spread  among  the  performances  of  the  three 
composite  codes,  and  an  even  greater  spread  among  the  "other  image"  codes. 

It  is  also  concluded  that  if  default  Huffman  codes  are  employed,  there  should 
be a separate set for each combination of transmission parameters such as
sub-sampling and data compression scale factor. The inherent probability functions
mentioned above probably vary sufficiently with transmission parameters to
warrant  separate  code  sets. 

If  image  characteristics  could  be  classified,  as  well  as  transmission 
parameters,  then  a  separate  Huffman  code  set  could  be  generated  for  each  class. 
The transmitter and receiver would both store all default Huffman code tables.
The transmitter would decide which to use, based on transmission parameters and
image  class,  and  transmit  the  appropriate  tables.  If  such  classification  produces 
default  codes  that  are  nearly  optimal  for  all  images  in  a  given  class,  then  optimized 
codes,  which  require  the  transmitter  to  gather  statistics  and  generate  the  coding 
tables, would rarely be required.

4.0 HUFFMAN VS. INFORMATIONAL CODING

4.1  Introduction 

There  is  a  close  parallel  between  theoretical  coding,  based  on  information 
values,  and  Huffman  coding.  While  the  former  is  not,  in  general,  realizable  in 
practice,  it  is  very  easy  to  treat  analytically.  Huffman  coding  produces  average 
code  word  lengths  that  are  typically  within  tenths  of  a  bit,  and  are  guaranteed  to 
be within one bit, per coded symbol of the theoretical minimum.[1] Therefore, if
one  computes  the  theoretical  minimum  analytically,  one  obtains  a  good,  although 
optimistic,  estimate  of  the  number  of  bits  per  symbol  achievable  with  Huffman 
coding. 

The  following  discussion  shows  the  similarity  between  information  and 
Huffman  coding,  including  the  notion  of  "optimal"  and  "default"  information  values 
that  are  the  information  counterparts  of  optimal  and  default  Huffman  codes. 

4.2  Alphabets 

Texts  describing  variable  length  coding  typically  refer  to  the  transmission  of 
"messages"  composed  of  "symbols."  The  entire  set  of  symbols  from  which  a 
message  can  be  composed  is  called  an  "alphabet."  The  information  value,  in  bits, 
of  each  symbol,  s,  is  given  by: 

\[ I(s) = -\log_2 [p(s)] \]

where  p(s)  is  the  probability  of  symbol  s.  The  entropy  of  a  source  producing  such 
messages  is  given  by: 


\[ H = \sum_s p(s)\, I(s) = -\sum_s p(s) \log_2 [p(s)] \]

This  is  the  theoretical  lower  bound  on  the  average  number  of  coded  bits  per 
symbol  when  the  symbols  are  coded  independently  of  one  another. 
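
For a histogram of symbol counts, both quantities can be computed directly. The Python sketch below is illustrative: the function names are hypothetical, and the counts are made up rather than taken from the test images.

```python
import math

def information_value(p):
    """Information value in bits of a symbol with probability p."""
    return -math.log2(p)

def entropy_bits(counts):
    """Entropy in bits per symbol of the distribution implied by the counts."""
    total = sum(counts.values())
    return sum((n / total) * information_value(n / total)
               for n in counts.values() if n > 0)

# Made-up counts for four symbols, giving probabilities 1/2, 1/4, 1/8, 1/8.
counts = {0: 8, 1: 4, 2: 2, 3: 2}
uniform = {0: 1, 1: 1, 2: 1, 3: 1}

skewed_entropy = entropy_bits(counts)     # 1.75 bits per symbol
uniform_entropy = entropy_bits(uniform)   # 2.0 bits per symbol
print(skewed_entropy, uniform_entropy)
```

The more uneven the histogram, the lower the entropy, and the more a variable length code can save over a fixed length one.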

In  the  current  context,  there  are  four  alphabets  corresponding  to  the  four 
sets  of  data  for  DC  luminance,  AC  luminance,  DC  chrominance  and  AC 
chrominance.  Since  there  are  only  two  sets  of  symbols:  SSSS  for  DC  and 
RRRRSSSS  for  AC,  one  might  argue  that  there  are  only  two  alphabets.  Because 
the  statistics  of  luminance  and  chrominance  data  are  distinctly  different  in  both  the 
DC  and  AC  symbol  sets,  and  because  separate  Huffman  coding  tables  are  provided 
for  luminance  and  chrominance,  the  four  sets  of  data  are  treated  separately,  and 
each  is  treated  as  having  its  own  alphabet.  It  is  also  assumed  that  an  alphabet  is 
comprised only of symbols that actually can occur, even if some do so only very
rarely.* The following discussion applies to any one of these alphabets.

4.3  Analysis 

Consider  the  following  two  equations,  the  first  applying  to  information 
coding  and  the  second  to  Huffman  coding. 


$$ L_I = \sum_s p(s)\, I(s) \qquad (1) $$

$$ L_h = \sum_s p(s)\, h(s) \qquad (2) $$

where p(s) is a probability function, I(s) is defined by:

$$ I(s) = -\log_2 [q(s)] \qquad (3) $$

in which q(s) is another probability function that may be, but is not necessarily, the
same as p(s), and h(s) is a Huffman code word length. L_I and L_h are the average
number of bits per symbol for information and Huffman coding respectively.
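As a minimal sketch of how Equations (1) and (2) are evaluated, the following Python fragment computes both averages for a small alphabet. The function names, the three symbols, and the probability and code length values are hypothetical and chosen only so that the arithmetic can be checked by hand.

```python
import math

def average_information(p, q):
    """Equation (1): sum of p(s) * I(s), with I(s) = -log2 q(s)."""
    return sum(ps * -math.log2(q[s]) for s, ps in p.items() if ps > 0)

def average_code_length(p, h):
    """Equation (2): sum of p(s) * h(s)."""
    return sum(ps * h[s] for s, ps in p.items() if ps > 0)

# Hypothetical three-symbol alphabet.
p = {"a": 0.5, "b": 0.25, "c": 0.25}   # probabilities of the data being coded
q = {"a": 0.25, "b": 0.25, "c": 0.5}   # probabilities the code was built from
h = {"a": 2, "b": 2, "c": 1}           # Huffman code word lengths for q

print(average_information(p, p))  # 1.5  (q = p: the entropy)
print(average_information(p, q))  # 1.75
print(average_code_length(p, h))  # 1.75
```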

Equation (1) has the form of the entropy equation. However, I(s) is based
on a probability function that is not necessarily p(s). In the current context, the
symbols  of  an  alphabet  do  not  have  an  inherent  probability  function  as  do  the 
results  of  coin  tosses,  dice  rolls  and  the  dealing  of  poker  hands.  The  relative 
occurrence  frequencies  of  the  symbols  in  an  alphabet  depend  upon  such 
parameters  as  sub-sampling  and  compression  scale  factor.  Even  with  constant 
parameters,  the  relative  frequencies  vary  from  image  to  image. 

For  the  purposes  of  this  discussion,  we  assume  that  a  probability  function  is 
estimated  in  the  following  manner:  Accumulate  a  histogram  of  the  number  of 
occurrences,  n(s),  of  each  symbol  in  an  alphabet  over  one  image,  or  over  many 
images,  and  divide  each  n(s)  by  the  total  number  of  occurrences  of  all  the  symbols. 
The  resulting  set  of  ratios  has  the  following  required  characteristics  of  a  probability 
function:  (1)  each  ratio  lies  in  the  closed  interval  of  0  through  1,  and  (2)  the  sum 
of the ratios is 1.
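The estimation procedure just described can be sketched as follows. The symbol stream is a made-up list of SSSS-like values, and `estimate_probabilities` is a name introduced here for illustration.

```python
from collections import Counter

def estimate_probabilities(symbols):
    """Histogram the symbols and divide each count n(s) by the total count."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return {s: n / total for s, n in counts.items()}

# Hypothetical symbol stream (illustrative values only).
symbols = [0, 0, 0, 1, 1, 2, 3, 3]
p = estimate_probabilities(symbols)
print(p)                # {0: 0.375, 1: 0.25, 2: 0.125, 3: 0.25}
print(sum(p.values()))  # 1.0
```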

The  only  difference  between  such  an  estimated  probability  function  and  a 
theoretical  one,  aside  from  the  fact  that  the  estimated  probabilities  are 
approximate,  is  that  a  possible  but  improbable  symbol  might  not  occur  at  all  in  the 
accumulated data, in which case the estimated probability of the symbol is 0. In a
theoretical discrete probability function, the probabilities of all possible symbols are
greater than 0.

* JPEG assumes 256 AC symbols, some of which are impossible. This study
excludes impossible symbols from its "alphabets."

The  presence  of  zero  probabilities  in  an  estimated  probability  function  means 
that  in  the  accumulated  data  some  rare  but  possible  symbols  simply  failed  to 
occur. The information value, I(s), of a symbol having zero probability does not
exist (-log2(p) approaches infinity as p approaches 0), and the JPEG simulation
program  does  not  generate  a  Huffman  code  for  non-occurring  symbols.  This  is  not 
a  problem  if  there  is  no  symbol  s  for  which  p(s)  >  0  while  q(s)  =  0,  because,  in 
the  theoretical  case,  -p  log2(p)  approaches  0  as  p  approaches  0,  and  in  the 
Huffman  case,  a  code  word  is  not  required  for  a  symbol  that  never  occurs.  If, 
however,  q(s)  is  the  probability  function  from  which  "default"  information  values 
or  Huffman  codes  are  generated,  then  a  symbol  might  occur  in  the  histogram 
which generates p(s) but not in the one from which q(s) is derived. Consequently,
when  the  latter  is  to  be  the  basis  of  "default"  codes,  to  each  (possible)  s  for  which 
n(s)  =  0  the  program  that  builds  the  histogram  of  n(s)  arbitrarily  assigns  a  value  of 
1  to  n(s),  thus  preventing  a  zero  value  for  q(s). 
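The adjustment for "default" codes can be sketched as below. The alphabet, the counts, and both function names are hypothetical; the only behavior taken from the text is that every possible symbol with n(s) = 0 is given a count of 1.

```python
def default_code_histogram(counts, alphabet):
    """Give every possible symbol a count of at least 1."""
    return {s: max(counts.get(s, 0), 1) for s in alphabet}

def probabilities(histogram):
    total = sum(histogram.values())
    return {s: n / total for s, n in histogram.items()}

# Hypothetical five-symbol alphabet in which symbols 3 and 4 never occurred.
alphabet = range(5)
counts = {0: 60, 1: 30, 2: 8}
q = probabilities(default_code_histogram(counts, alphabet))
print(q)  # {0: 0.6, 1: 0.3, 2: 0.08, 3: 0.01, 4: 0.01}
```

With this adjustment no q(s) is zero, so -log2 q(s) is finite for every possible symbol.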

It is now shown that, assuming that p(s) > 0 and q(s) > 0 for all s, L_I in
Equation (1) is minimum when q(s) = p(s). The proof consists of minimizing, with
respect to q_1, q_2, ..., q_i, ..., q_n, the function

$$ F(q_1, q_2, \ldots, q_n) = -[\,p_1 \ln(q_1) + p_2 \ln(q_2) + \cdots + p_n \ln(q_n)\,] $$

with the constraint that the sum of the q's is 1. The method (Lagrange
multipliers) is explained, for example, in Kaplan.[2] In F, it is valid to use natural
logarithms instead of logarithms to the base 2, because log2(x) = ln(x) / ln(2).
Consequently, the q_i's that minimize F also minimize L_I.

For the case at hand, let G(q_1, q_2, ..., q_n) be the constraint function expressed
as:

$$ G(q_1, q_2, \ldots, q_n) = q_1 + q_2 + \cdots + q_n - 1 = 0. $$

Finding a critical point in the n-dimensional space consists of, for each i = 1, 2,
..., n, adding the partial derivative of F with respect to q_i to a constant, m, times
the partial derivative of G with respect to q_i and setting the result to zero. The
resulting equation for each symbol i is:

$$ -p_i / q_i + m = 0, \qquad (4) $$

whence

$$ q_i = p_i / m. $$
Putting in the constraint that the sum of the q's is 1 gives:

$$ \sum_i q_i = (1/m) \sum_i p_i = 1. \qquad (5) $$

Since the sum of the p's is also 1, Equations (4) and (5) are satisfied when and
only when m = 1 and q_i = p_i for all i. Thus, the critical point is the n-dimensional
point q_i = p_i for all i. The fact that this point is a minimum is established by taking
the second partial derivatives of F with respect to q_i. These are p_i / q_i^2, which,
when q_i = p_i, equal 1/p_i and, for p_i > 0, exist and are positive.
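The result of the proof can also be checked numerically. The sketch below draws a random probability function p, then compares the average information obtained with many random q functions against the value obtained with q = p. The alphabet size, the random seed, and the function names are arbitrary choices made for this illustration.

```python
import math
import random

def cross_entropy_bits(p, q):
    """Average information, Equation (1), with I(s) = -log2 q(s)."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q))

def random_distribution(n, rng):
    """A random probability function with n strictly positive entries."""
    weights = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(0)
p = random_distribution(12, rng)
entropy = cross_entropy_bits(p, p)

# Excess of the average information over the entropy for 1000 random q's.
excess = [cross_entropy_bits(p, random_distribution(12, rng)) - entropy
          for _ in range(1000)]
print(min(excess) >= 0.0)
```

The excess is never negative, which agrees with the conclusion that the minimum occurs at q(s) = p(s).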

4.4  The  Comparison 

For  each  alphabet,  let  p(s)  now  be  defined  as  the  estimated  probability 
function  derived  from  the  histogram  compiled  from  the  transmission  of  one  image, 
which  will  be  called  the  test  image.  Let  q(s)  be  the  estimated  probability  function 
derived  from  a  histogram  which  is  compiled  from  one  of  the  following:  (1)  the  test 
image,  (2)  some  other  image,  or  (3)  a  composite  of  a  number  of  images  which  may 
but  need  not  include  the  test  image.  Except  for  the  special  case  in  which  the 
histogram  for  q(s)  is  compiled  from  just  the  test  image,  this  histogram  must  be 
adjusted,  if  necessary,  to  guarantee  that  n(s)  >  0  for  all  (possible)  s,  as  described 
above,  to  ensure  that  there  is  no  s  for  which  p(s)  >  0  while  q(s)  =  0.  Let  h(s)  be 
the  Huffman  code  word  length  for  symbol  s  derived  from  the  same  histogram  that 
produces q(s). Then L_I, as defined in Equation (1), is a good predictor of the
average Huffman code word length per symbol, L_h, as defined in Equation (2), and,
as experiments described below show, is a better, more realistic approximation
than the optimistic value given by the entropy. I(s) in Equation (1) can be thought
of as the information code word length, analogous to h(s), the Huffman code word
length in Equation (2). When q(s) = p(s), the I(s) values are optimal, and L_I is
minimum, just as an optimized Huffman code derived from p(s) gives the minimum
average Huffman code word length. If q(s) is derived from a composite histogram
of many images, as is typically done to generate a "default" Huffman code, then
both L_I and L_h increase. Thus, one can think of I(s) as "optimal" or "default"
information values for q(s) equal or not equal to p(s), analogously to "optimal" or
"default" Huffman code word lengths.

4.5  Experiments 

The  purpose  of  the  experiments  was  to  observe  the  behavior  of  the  average 
information  value  and  Huffman  code  word  length  when  a  test  image  is  coded  with 
optimized  and  non-optimized  information  values  and  Huffman  code  words. 

The  experiments  were  performed  on  the  following  data,  one  data  set  per 
alphabet: 

(1)  A  histogram  of  symbols  that  actually  occur  for  each  of  eight  test  images, 
(2) A histogram for each image with possible, but non-occurring symbols
assigned counts of 1,

(3) A composite histogram consisting of the sum of the histograms for all
eight images, with possible symbols that never occur in any of the eight
images assigned counts of 1.

The test images are fully sampled and compressed with a compression scale
factor of 25, which is the JPEG default setting.

Probability  functions  p(s)  were  computed  for  each  image  from  histograms 
(1).  Probability  functions  q(s)  were  computed  from  histograms  (2)  and  (3). 
Histograms  (2)  and  (3)  were  submitted  to  the  modified  JPEG  simulator  to  obtain 
Huffman  code  length  data  from  each  histogram  for  all  the  possible  symbols. 

For each alphabet, L_I, the average information value, was computed by
Equation (1) for each test image and for each of the q(s) functions derived from
histograms (2) and (3) to show how L_I behaves when the source of coding
information  (called  the  "code  source"  in  the  following  tables)  is  the  same  image,  a 
different  image,  or  the  composite.  Similarly,  the  average  Huffman  code  word 
length,  Lh,  was  computed  by  Equation  (2)  for  each  test  image  and  for  each 
Huffman  code  word  length  table  derived  from  histograms  (2)  and  (3). 
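The computation described above can be sketched as follows for one test image and one code source. This is not the modified JPEG simulator used in the study: the Huffman routine is a plain, unconstrained textbook construction (JPEG additionally limits code word lengths), and the two histograms are invented for illustration.

```python
import heapq
import math

def huffman_code_lengths(counts):
    """Return {symbol: code word length} for an unconstrained Huffman code."""
    if len(counts) == 1:
        return {s: 1 for s in counts}
    # Heap entries: (count, tie-breaker, symbols in this subtree).
    heap = [(n, i, [s]) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(counts, 0)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, s1 = heapq.heappop(heap)
        n2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:          # every symbol in the merged subtree
            lengths[s] += 1        # moves one level deeper in the code tree
        heapq.heappush(heap, (n1 + n2, tie, s1 + s2))
        tie += 1
    return lengths

def probabilities(counts):
    total = sum(counts.values())
    return {s: n / total for s, n in counts.items()}

# Hypothetical histograms: the test image and a different code source.
test_counts = {0: 50, 1: 25, 2: 15, 3: 10}
code_counts = {0: 10, 1: 40, 2: 30, 3: 20}

p = probabilities(test_counts)
q = probabilities(code_counts)
h = huffman_code_lengths(code_counts)

entropy = sum(p[s] * -math.log2(p[s]) for s in p)
L_I = sum(p[s] * -math.log2(q[s]) for s in p)   # Equation (1)
L_h = sum(p[s] * h[s] for s in p)               # Equation (2)
print(h)  # {0: 3, 1: 1, 2: 2, 3: 3}
print(round(entropy, 4), round(L_I, 4), round(L_h, 4))
```

Repeating the last five lines for every pairing of test image and code source would fill one table of the kind reported in the results.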

It  should  be  noted  that,  for  any  given  image,  the  total  number  of  coded  bits 
for  all  four  alphabets  is  less  than  the  total  number  of  bits  in  the  compressed  bit 
stream. The latter also includes SSSS additional bits per symbol, plus overhead,
where SSSS is the "size" component of the symbol. The sensitivity of L_I and L_h to
different code sources is therefore masked somewhat by the presence of these
other bits; hence the sensitivity of the overall data compression is less than that of
L_I or L_h.

4.6  Results 

Table  1  gives  the  names  of  the  test  images  to  which  the  test  image  and 
code  source  numbers  in  the  remaining  tables  correspond.  Code  source  9  is  derived 
from  the  composite  histogram  of  all  eight  test  images,  and  is  therefore  not 
considered  as  a  single  test  image.  Images  4  and  5,  LATOUR1  and  LATOUR2,  are 
actually  the  left  and  right  halves  of  the  LATOUR  image,  which  was  too  large  to 
process  through  the  JPEG  simulator. 

Tables  2  through  5  show  the  experimental  results  for  the  four  alphabets. 
Each  row  represents  different  test  images  coded  by  the  same  code  source,  and 
each  column  represents  a  single  test  image  coded  by  different  code  sources.  In 
each  table  cell,  the  top  value  is  the  average  information  value,  and  the  bottom 
value  is  the  average  Huffman  code  length. 
TABLE 1

KEY TO TEST IMAGES AND CODE SOURCES

Number    Test Image / Code Source

  1       APPOTLAB
  2       Cl LAB
  3       FAXBALLS
  4       LATOUR1
  5       LATOUR2
  6       PM003_NL
  7       SA001_NL
  8       TOYSLAB
  9       Composite (code source only)

4.7  Observations  and  Discussion 

In  each  table  column  (same  test  image,  different  coding  sources),  both  the 
average  information  value  and  Huffman  code  word  length  are  minimum,  as 
expected,  when  the  coding  source  is  the  same  image,  yielding  optimized 
information  values  and  Huffman  codes,  i.e.,  q(s)  =  p(s). 

When  the  code  source  is  other  than  the  test  image,  the  average  information 
value  computed  from  Equation  (1)  is,  in  most  but  not  all  cases,  a  better  predictor 
of  the  average  Huffman  code  word  length  than  the  entropy,  which  is  the  average 
information  value  when  the  code  source  is  the  test  image.  In  a  large  number  of 
cases, L_I and L_h agreed to within hundredths of a bit per symbol. In some cases
the  values  differed  by  a  few  tenths  of  a  bit  per  symbol. 

The  composite  histogram,  in  most,  but  not  all  cases,  produced  Huffman 
codes  that  were  at  least  as  efficient  as  Huffman  codes  produced  by  any  single 
code  source  other  than  the  test  image  itself. 

The  Huffman  code  length  averages  in  rows  4  and  5  (LATOUR1  and 
LATOUR2)  of  the  DC  luminance  table  are  identical  for  a  given  test  image,  although 
the average information values differ, by approximately two tenths
of a bit per symbol in column 7. Further investigation revealed that these two
images  produced  identical  Huffman  code  word  lengths  for  each  DC  luminance 
symbol  despite  considerably  different  histograms.  Thus,  Huffman  code  word 
lengths  are  less  sensitive  to  differences  in  probability  functions  than  are 
information  values. 
In a number of cases, but never when the coding source was the test image,
the average Huffman code word length was less than the average information
value.  This  result  was  so  counter-intuitive  (because  the  average  information 
formula  closely  resembles  the  entropy  formula)  that  one  of  these  cases  in  the  DC 
luminance  alphabet  was  verified  by  hand  from  the  raw  data  (with  the  aid  of  a 
spreadsheet  program)  to  prove  that  the  result  was  real,  and  not  due  to  a  bug  in  the 
computer  programs.  Evidently,  although  the  entropy  (q(s)  =  p(s))  is  the  absolute 
lower  bound  on  both  the  average  information  value  and  average  Huffman  code 
word  length,  the  average  information  value  computed  when  q(s)  is  different  from 
p(s)  is  not  a  lower  bound  on  the  average  Huffman  code  word  length  when  the 
Huffman  code  is  derived  from  q(s). 
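A small hand-checkable case shows how this can happen. The numbers below are hypothetical and are not taken from the DC luminance data. For q = (0.4, 0.3, 0.3), the Huffman procedure merges the two 0.3 symbols first, giving code word lengths of 1, 2 and 2 bits, while the information values -log2 q(s) are about 1.32, 1.74 and 1.74 bits. The most probable test symbol therefore gets a Huffman code word that is shorter than its information value.

```python
import math

p = [0.8, 0.1, 0.1]   # hypothetical test-image probabilities
q = [0.4, 0.3, 0.3]   # hypothetical code-source probabilities
h = [1, 2, 2]         # Huffman code word lengths for q, derived by hand

entropy = -sum(pi * math.log2(pi) for pi in p)
L_I = -sum(pi * math.log2(qi) for pi, qi in zip(p, q))
L_h = sum(pi * hi for pi, hi in zip(p, h))
print(round(entropy, 3), round(L_I, 3), round(L_h, 3))  # 0.922 1.405 1.2
```

The entropy remains below both averages, but the Huffman average falls below the average information value because p(s) puts most of its weight on the symbol whose code word length was rounded down.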

4.8  Conclusion 

Averages  of  information  values  derived  from  the  same  probability  function  as 
that  which  produces  a  Huffman  code  are  easy  to  compute  without  actually 
generating  Huffman  codes,  and  are  good  predictors  of  Huffman  code  performance. 
When  this  probability  function  is  the  same  as  that  of  a  test  image,  the  average 
information  is  the  entropy,  and  the  resulting  Huffman  code  is  optimal  for  that 
image.  When  the  probability  function  is  different  from  that  of  the  test  image,  the 
resulting  average  information  value  and  Huffman  code  word  length  both  increase, 
the  former  sometimes  increasing  more  than  the  latter,  but  the  two  are  generally  in 
better  agreement  than  are  the  degraded  Huffman  code  and  the  entropy. 
TABLE 2

AVERAGE INFORMATION VALUES AND
HUFFMAN CODE LENGTHS FOR DC LUMINANCE

Code                                   Test Image
Source         1        2        3        4        5        6        7        8

  1   Info   2.4851   3.5812   3.3456   2.5169   2.6887   2.6606   3.4370   3.0655
      Huff   2.5857   3.3677   3.0596   2.5499   2.6584   2.6656   3.1387   2.9281

  2   Info   3.0786   2.9656   2.9775   3.0750   3.0556   3.0102   2.9685   3.0117
      Huff   3.1395   3.0526   3.0726   3.1601   3.1503   3.0317   3.1119   3.1197

  3   Info   3.2784   3.3948   2.6314   3.0059   2.9931   2.9134   2.8779   3.0490
      Huff   3.6471   3.6938   2.6710   3.2593   3.2209   3.1591   2.9741   3.2579

  4   Info   2.6047   3.8285   3.1357   2.4200   2.5533   2.7001   3.1704   2.9682
      Huff   2.6888   3.8553   3.1189   2.4554   2.5696   2.8008   3.1056   2.9641

  5   Info   2.6579   3.6916   2.9919   2.4424   2.5300   2.7206   2.9704   2.8728
      Huff   2.6888   3.8553   3.1189   2.4554   2.5696   2.8008   3.1056   2.9641

  6   Info   2.6194   3.4208   2.9964   2.5947   2.7266   2.5267   3.2636   3.0135
      Huff   2.8733   3.6378   3.0261   2.8111   2.9542   2.6232   3.4861   3.2457

  7   Info   3.2903   3.4415   2.9489   3.0021   2.9119   3.2852   2.6299   2.9233
      Huff   3.3140   3.3779   3.0372   3.0576   2.9704   3.3750   2.6968   2.9637

  8   Info   2.7459   3.2478   2.8500   2.5973   2.6182   2.7791   2.7612   2.7738
      Huff   2.7887   3.4369   3.0255   2.6167   2.6416   2.9410   2.8359   2.8407

  9   Info   2.6190   3.2625   2.8879   2.5218   2.5867   2.6530   2.8950   2.7995
      Huff   2.6273   3.4181   2.8900   2.5115   2.5912   2.6347   2.9731   2.8566

Info = average information value; Huff = average Huffman code length.
TABLE 3

AVERAGE INFORMATION VALUES AND
HUFFMAN CODE LENGTHS FOR AC LUMINANCE

Code                                   Test Image
Source         1        2        3        4        5        6        7        8

  1   Info   3.3138   3.6623   3.3955   3.1771   3.1872   3.8153   3.3160   3.2420
      Huff   3.3501   3.6663   3.4344   3.2292   3.2348   3.8417   3.3514   3.2814

  2   Info   3.5251   3.4469   3.4555   3.4518   3.4120   3.7836   3.2717   3.4038
      Huff   3.5602   3.4839   3.4060   3.4526   3.4145   3.8346   3.2709   3.4092

  3   Info   3.5668   3.6724   3.1553   3.3465   3.3649   3.7772   3.3627   3.3923
      Huff   3.5520   3.7092   3.1986   3.3305   3.3579   3.7974   3.3978   3.3966

  4   Info   3.3492   3.7641   3.3608   3.1395   3.1651   3.8963   3.3398   3.2437
      Huff   3.4202   3.9051   3.3818   3.1781   3.2166   3.9965   3.4312   3.3015

  5   Info   3.3425   3.7160   3.3540   3.1449   3.1599   3.8817   3.2996   3.2306
      Huff   3.4196   3.9106   3.3760   3.1839   3.2183   4.0132   3.4305   3.2987

  6   Info   3.4598   3.5465   3.2622   3.3140   3.3217   3.6492   3.3186   3.3395
      Huff   3.4821   3.5403   3.3211   3.3573   3.3551   3.6751   3.3266   3.3662

  7   Info   3.4407   3.5511   3.3576   3.2901   3.2644   3.8644   3.1993   3.2765
      Huff   3.5643   3.5454   3.4297   3.4354   3.3926   3.9115   3.2452   3.3893

  8   Info   3.3429   3.6421   3.3295  3.17563   3.1786   3.8362   3.2612   3.2168
      Huff   3.3859   3.6388   3.3742   3.2349   3.2294   3.8867   3.2794   3.2601

  9   Info   3.3519   3.5306   3.3024   3.2091  3.20643   3.7280   3.2317   3.2394
      Huff   3.4012   3.5566   3.3200   3.2586   3.2543   3.7713   3.2609   3.2809

Info = average information value; Huff = average Huffman code length.
TABLE 4

AVERAGE INFORMATION VALUES AND
HUFFMAN CODE LENGTHS FOR DC CHROMINANCE

Code                                   Test Image
Source         1        2        3        4        5        6        7        8

  1   Info   1.9022   3.0695   2.3880   1.8629   2.1101   1.9979   2.1539   2.1060
      Huff   1.9845   2.9388   2.2906   1.8728   2.1322   1.8893   2.2379   2.1775

  2   Info   2.2928   2.5533   2.3350   2.2262   2.2864   2.2242   2.3138   2.2984
      Huff   2.1654   2.6391   2.3011   2.1250   2.1920   2.1897   2.1544   2.1636

  3   Info   2.0955   2.7154   2.1908   1.9755   2.1394   1.9511   2.1855   2.1655
      Huff   2.2388   2.9496   2.2689   2.0782   2.2990   1.9914   2.3486   2.3398

  4   Info   1.9649   3.1546   2.3779   1.8258   2.0881   1.9217   2.1785   2.1088
      Huff   1.9845   2.9388   2.2906   1.8728   2.1322   1.8893   2.2379   2.1775

  5   Info   1.9669   2.8954   2.2889   1.8678   2.0438   1.9692   2.0779   2.0391
      Huff   1.9845   2.9388   2.2906   1.8728   2.1322   1.8893   2.2379   2.1775

  6   Info   2.0434   2.9630   2.2917   1.8833   2.1504   1.8619   2.2915   2.2175
      Huff   1.9845   2.9369   2.2906   1.8728   2.1322   1.8891   2.2379   2.1775

  7   Info   2.0179   2.9592   2.3477   1.9682   2.0934   2.1260   2.0272   2.0353
      Huff   2.1654   2.6410   2.3011   2.1250   2.1920   2.1898   2.1544   2.1636

  8   Info   1.9909   2.9647   2.3374   1.9112   2.0600   2.0585   2.0410   2.0230
      Huff   1.9974   2.9942   2.5100   2.0080   2.1614   2.2160   2.1550   2.1199

  9   Info   1.9481   2.8150   2.2610   1.8752   2.0499   1.9644   2.0800   2.0469
      Huff   1.9845   2.9369   2.2906   1.8728   2.1322   1.8891   2.2379   2.1775

Info = average information value; Huff = average Huffman code length.
TABLE 5

AVERAGE INFORMATION VALUES AND
HUFFMAN CODE LENGTHS FOR AC CHROMINANCE

Code                                   Test Image
Source         1        2        3        4        5        6        7        8

  1   Info   2.5979   3.5480   2.5075   2.4565   2.7258   3.4287   2.5614   2.5253
      Huff   2.6858   3.4500   2.6954   2.6227   2.8066   3.4582   2.6514   2.6235

  2   Info   2.8616   3.1297   2.5756   2.6839   2.8279   3.1802   2.7116   2.7823
      Huff   2.8034   3.1584   2.5139   2.6081   2.7860   3.1494   2.6789   2.7257

  3   Info   2.9204   3.4706   2.2519   2.4698   2.8481   3.2273   2.7638   2.8158
      Huff   2.9014   3.4833   2.2717   2.4700   2.8499   3.2091   2.7763   2.8126

  4   Info   2.6897   3.5002   2.3452   2.3672   2.7063   3.2425   2.6120   2.5969
      Huff   2.7343   3.6255   2.3830   2.3928   2.7555   3.3855   2.6438   2.6369

  5   Info   2.6431   3.3310   2.3867   2.4072   2.6664   3.1845  2.55322   2.5564
      Huff   2.7280   3.5765   2.3637   2.3895   2.7441   3.3143   2.6511   2.6382

  6   Info   2.8069   3.2507   2.3798   2.4884   2.7601   3.0451   2.7083  2.73512
      Huff   2.8578   3.1879   2.5717   2.6498   2.8240   3.1180   2.7414   2.7891

  7   Info   2.6375   3.4402   2.4697   2.4627   2.7080   3.3821   2.5204   2.5380
      Huff   2.7231   3.4087   2.6515   2.6113   2.7974   3.4519   2.6113   2.6326

  8   Info   2.6129   3.6007   2.5221   2.4539   2.7266   3.5072   2.5442   2.5165
      Huff   2.6960   3.5679   2.7026   2.6220   2.8205   3.5409   2.6330   2.6034

  9   Info   2.6394   3.2740   2.3933   2.4327   2.6735   3.1534   2.5520   2.5600
      Huff   2.7293   3.2090   2.5501   2.5796   2.7486   3.1941   2.6248   2.6526

Info = average information value; Huff = average Huffman code length.
5.0 FUTURE PLANS


5.1  Determining  Optimal  Default  Huffman  Code 

The  main  conclusion  reached  from  the  Huffman  coding  study  was  that  the 
tested  "default"  Huffman  codes  came  from  an  insufficiently  large  sample  of  the 
Huffman  coded  symbols  or  that  there  was  an  insufficient  mix  of  image 
characteristics. 

5.1.1  Criterion  for  Determining  Required  Sample  Sizes 

Suppose  that  a  symbol  s  has  probability  p(s),  and  one  attempts  to  estimate 
the  value  of  p(s)  by  sampling  a  large  number  of  symbols  at  random.  The  smaller 
the  value  of  p(s),  the  larger  must  be  the  number  of  samples  to  obtain  a  reliable 
estimate  of  p(s).  If  one  takes  N  samples,  where  N  is  a  large  number,  then  the 
expected,  but  by  no  means  necessarily  the  actual,  number  of  times  that  symbol  s 
occurs  is  N  p(s).  For  p(s)  small,  the  standard  deviation  about  N  p(s)  is  given  by 


$$ \sigma = \sqrt{N\, p(s)}. $$

For the number of occurrences of s to have a good chance of being within
some fraction, f, of N p(s) requires that the standard deviation be no greater
than f N p(s), from which

$$ N \ge \frac{1}{f^2\, p(s)}. $$

Thus,  the  smaller  p(s),  the  larger  the  number  of  samples  required.  For 
example,  if  p(s)  =  0.001,  then  100,000  symbols  would  have  to  be  sampled  for 
there  to  be  a  good  chance  (about  68%  probability,  i.e.  within  one  standard 
deviation  of  a  normal  probability  density  function)  for  symbol  s  to  occur  within  10 
percent  of  the  expected  number  of  times. 
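The worked example can be reproduced with a short calculation. The function names are introduced here for illustration; the formulas are the small-p standard deviation and the sample size criterion given above.

```python
import math

def required_samples(p, f):
    """N at which sqrt(N * p) equals f * N * p, i.e. N = 1 / (f**2 * p)."""
    return 1.0 / (f * f * p)

def relative_std(p, n):
    """Standard deviation of the count divided by its expected value N * p."""
    return math.sqrt(n * p) / (n * p)

n = required_samples(0.001, 0.1)
print(round(n))                                # 100000
print(round(relative_std(0.001, 100_000), 3))  # 0.1
```

Halving the tolerance f quadruples the required number of samples, and halving p(s) doubles it.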

The  probability  function  p(s)  itself,  however,  depends  upon  the  mix  of  the 
image  characteristics  in  the  collection  of  images  from  which  the  samples  are 
drawn.  Different  kinds  of  images  have  different  probability  functions;  for  example, 
busy  vs.  bland,  line  drawings  vs.  landscapes.  One  must  in  effect  pour  the  symbols 
from  a  large  number  of  images  representing  all  kinds  of  images  likely  to  be 
compressed  into  a  common  pool,  and  then  create  a  composite  histogram  (for  each 
alphabet)  either  from  the  entire  pool  or  from  sufficiently  many  random  samples 
taken  from  that  pool,  where  the  number  of  samples  required  is  dictated  by  the 
above  sampling  criterion. 
5.1.2 Approach to Best Default Huffman Codes

The  first  step  is  to  collect  as  many  test  images  as  possible  to  produce  a 
thorough  mix  of  the  various  image  characteristics.  Ideally,  the  mix  of  image  types 
in  the  collection  would  be  in  the  same  proportions  as  in  the  "universe"  of  all 
images  ever  transmitted.  If  this  "universal  mix"  cannot  be  estimated  in  any 
straightforward  manner,  then  one  must  simply  collect,  at  random,  as  many 
different  kinds  of  images  as  possible. 

The  next  step  is  to  pour  randomly  chosen  subsets  of  the  available  images 
into  separate  pools,  and  build  composite  histograms  from  each  pool  as  well  as 
from  the  entire  set.  If  the  Huffman  codes  generated  from  the  subsets  perform  to 
within  a  few  percent  of  one  another  and  of  the  entire  set,  then  one  can  conclude 
that  the  various  image  types  are  sufficiently  well  represented  in  the  collection. 
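One way the subset test might be organized is sketched below. It is an assumption-laden illustration, not the planned procedure: it uses the average information value of Section 4.0 as a stand-in for actual Huffman code performance, the four image histograms are invented, and all function names are hypothetical.

```python
import math
import random
from collections import Counter

def pooled(histograms):
    """Sum the per-image histograms into one composite count."""
    total = Counter()
    for hist in histograms:
        total.update(hist)
    return total

def code_probabilities(histograms, alphabet):
    """Composite q(s), with non-occurring possible symbols counted once."""
    total = pooled(histograms)
    floored = {s: max(total[s], 1) for s in alphabet}
    n = sum(floored.values())
    return {s: c / n for s, c in floored.items()}

def average_information(test_counts, q):
    n = sum(test_counts.values())
    return sum(c / n * -math.log2(q[s]) for s, c in test_counts.items() if c > 0)

def worst_subset_penalty(histograms, alphabet, subset_size, trials, rng):
    """Largest relative change in bits per symbol over random image subsets."""
    test = pooled(histograms)
    baseline = average_information(test, code_probabilities(histograms, alphabet))
    worst = 0.0
    for _ in range(trials):
        subset = rng.sample(histograms, subset_size)
        value = average_information(test, code_probabilities(subset, alphabet))
        worst = max(worst, abs(value - baseline) / baseline)
    return worst

# Hypothetical histograms for four images over a four-symbol alphabet.
rng = random.Random(1)
alphabet = range(4)
histograms = [
    {0: 50, 1: 30, 2: 15, 3: 5},
    {0: 40, 1: 35, 2: 20, 3: 5},
    {0: 55, 1: 25, 2: 15, 3: 5},
    {0: 45, 1: 30, 2: 20, 3: 5},
]
penalty = worst_subset_penalty(histograms, alphabet, subset_size=2, trials=20, rng=rng)
print(f"{100 * penalty:.2f} percent")
```

A penalty of no more than a few percent would suggest, under these assumptions, that the subsets represent the collection about as well as the whole set does.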

If  a  symbol  pool  is  too  large  to  generate  a  composite  histogram  or  a  Huffman 
code,  one  can  employ  the  theory  presented  in  Section  4.0  to  determine  the 
sensitivity  of  the  average  information  to  errors  in  probability  estimates.  This 
sensitivity  function  and  the  sampling  criterion  given  above  determine  the  required 
sample  size. 
Appendix A


Image Histograms


The  following  plots  are  of  histograms  of  the  four  Huffman  coded  symbol 
sets:  DC  luminance,  AC  luminance,  DC  chrominance,  and  AC  chrominance.  The 
composite  histograms  and  those  of  the  FAXBALLS  image  are  shown.  The 
histograms  apply  to  fully  sampled  images  and  a  compression  scale  factor  of  24. 
The  composite  histogram  was  used  to  generate  the  Delta  Huffman  code,  which 
was  one  of  the  "default"  Huffman  codes  tested  in  experiments  reported  in  Section 
3.0. 


The  FAXBALLS  image  was  chosen  because  the  percentage  difference,  4.6 
percent,  between  the  compressed  bit  count  produced  by  the  Delta  Huffman  code 
and  by  the  optimized  Huffman  code  was  greater  for  FAXBALLS  than  for  any  other 
image  in  the  test  set.  This  selection  shows  how  widely  image  histograms  can 
differ  from  the  composite  and  yet  produce  only  a  few  percent  degradation  in  data 
compression.  Actually,  the  percentage  difference  in  bit  count  owing  to  the 
Huffman  coding  alone  was  6.3  percent;  the  4.6  percent  figure  is  based  on  the  total 
bit  count,  to  which  the  Huffman  coding  contributed  approximately  65  percent. 
[Histogram plots not reproduced. Recoverable plot titles:]

Composite AC Histogram for Luminance
Composite DC Histogram for Chrominance
Composite AC Histogram for Chrominance
DC Histogram of FAXBALLS Luminance (horizontal axis: SSSS)
AC Histogram of FAXBALLS Luminance
DC Histogram of FAXBALLS Chrominance
AC Histogram of FAXBALLS Chrominance
Appendix B


Contribution DIO - Study Group 8


ITU - Telecommunication Standardization Sector

Study Group

Delayed Contribution

Geneva, 27 April - 6 May 1993

Text available only in


Questions:

SOURCE: JAPAN

Title: Experimental result for deciding JPEG default Huffman tables
on the color facsimile standardization

1.  Introduction 

In the associated rapporteur group meeting for Color Extension
for Group 4 Facsimile held in November 1982, it was decided that in the
case where the Huffman tables are not transmitted by the sending
terminal, default Huffman tables must be used. Under this decision,
Japan has carried out an experiment for providing default Huffman
tables. This contribution shows the result of our experiment.


2.  Discussion  and  Proposals 

2.1 Experimental procedure for deciding default Huffman tables

1) Test Images

• High Resolution Digital Test Image (NTT)

pm003_nlab.0~2 (2048*2048)
sa001_nlab.0~2 (2048*2048)

• SCID Test Image (ISO TC130)

Cafeteria (2048*2560)

2) Color Space

CIE LAB

3) Subsampling

(1:1:1) and (4:2:2)

In the case of (4:2:2), 1/2 subsampling is carried out on the "a" and
"b" color components after horizontal low-pass filtering weighted 1:2:1.
The filtering result is calculated by rounding.

Sample values which are situated outside of the image boundary are
replicated from the sample values at the boundary to provide missing
edge values in the filtering calculation.

After filtering, subsampling is realized by taking the odd numbered pixels
from the left edge in one pixel line.
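The filtering and subsampling of one pixel line can be sketched as follows. Two details are assumptions, since the text does not fix them: rounding is taken to be round-half-up, and "odd numbered" is taken to mean the 1st, 3rd, 5th, ... pixels when counting from 1 at the left edge. The function name and sample values are illustrative.

```python
def subsample_422(row):
    """Low-pass filter one line with weights 1:2:1, then keep every other sample."""
    n = len(row)
    filtered = []
    for x in range(n):
        # Samples outside the image are replicated from the boundary sample.
        left = row[x - 1] if x > 0 else row[0]
        right = row[x + 1] if x < n - 1 else row[-1]
        # (1*left + 2*centre + 1*right) / 4, rounded half up (assumption).
        filtered.append((left + 2 * row[x] + right + 2) // 4)
    # Pixels 1, 3, 5, ... counted from the left edge (indices 0, 2, 4, ...).
    return filtered[0::2]

print(subsample_422([10, 20, 30, 40, 50, 60]))  # [13, 30, 50]
```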

4) Quantization

Quantization step size =

{Recommended Quantization Table * Scaling factor / 60}

(Round to integer)

Recommended Quantization Tables are shown at page 161 in ISO DIS
10918-1.

(L* -> Table K.1, a*b* -> Table K.2)
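The step size calculation can be sketched as below. The divisor is used as it appears in the text of this contribution; the rounding rule (half up), the clamp to a minimum step of 1, the function name, and the eight table entries are assumptions made for illustration.

```python
def quantization_steps(recommended_table, scaling_factor, divisor=60):
    """Scale each recommended table entry and round to the nearest integer."""
    steps = []
    for q in recommended_table:
        # Integer form of floor(q * scaling_factor / divisor + 0.5).
        step = (2 * q * scaling_factor + divisor) // (2 * divisor)
        steps.append(max(step, 1))  # assumption: never allow a zero step size
    return steps

# Illustrative table entries only.
row = [16, 11, 10, 16, 24, 40, 51, 61]
print(quantization_steps(row, 24))  # [6, 4, 4, 6, 10, 16, 20, 24]
```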

Scaling factors used for the above calculation are as follows.

 8 -> 2.0 bit/color-pixel
24 -> 1.0 bit/color-pixel
71 -> 0.5 bit/color-pixel

5) Weighting Function

Weighting Function Wi (i = 1~6, sum of Wi = 1.0) is defined for each combination
of 2 subsampling rates and 3 scaling factors as follows.

Subsampling rate        Scaling factor
                        8       24      71

1:1:1                   W1      W2      W3
4:2:2                   W4      W5      W6

W1=2/12, W2=4/12, W3=2/12
W4=1/12, W5=2/12, W6=1/12
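The weights can be checked to sum to 1, and one plausible use of them is a weighted mixture of the per-combination probability functions. The mixture step is an assumption: the procedure in ANNEX A is not fully reproduced here, so `weighted_mixture` and the two-symbol probability functions below are hypothetical.

```python
from fractions import Fraction

# Weights W1..W6 keyed by (subsampling rate, scaling factor).
WEIGHTS = {
    ("1:1:1", 8): Fraction(2, 12),
    ("1:1:1", 24): Fraction(4, 12),
    ("1:1:1", 71): Fraction(2, 12),
    ("4:2:2", 8): Fraction(1, 12),
    ("4:2:2", 24): Fraction(2, 12),
    ("4:2:2", 71): Fraction(1, 12),
}
assert sum(WEIGHTS.values()) == 1

def weighted_mixture(prob_functions, weights):
    """Assumed combination rule: P(s) = sum over j of W_j * P_j(s)."""
    symbols = set().union(*prob_functions.values())
    return {s: sum(weights[j] * prob_functions[j].get(s, 0) for j in weights)
            for s in symbols}

# Hypothetical probability functions over two symbols, one per combination.
prob_functions = {j: {0: Fraction(3, 4), 1: Fraction(1, 4)} for j in WEIGHTS}
prob_functions[("4:2:2", 71)] = {0: Fraction(1, 4), 1: Fraction(3, 4)}
mixed = weighted_mixture(prob_functions, WEIGHTS)
print(mixed[0], mixed[1])  # 17/24 7/24
```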


6) Deriving Huffman Table

The procedure for deriving DC and AC default Huffman tables for the L
and a, b color components is shown in ANNEX A. The derived default
Huffman tables are shown in ANNEX B.
2.2 Comparison of compression rate

Table 1 shows a comparison of compression rates for the following
3 Huffman tables.

• Optimized Huffman tables for each image (Custom tables)
• Default Huffman tables (ANNEX B)
• Recommended Huffman tables from JPEG

As a result, the difference in compression rate between the default tables
and the recommended tables is very small.

When the compression rate (bit/pel) is more than 1.0, the difference is
-0.8 ~ +1.8%. ("-" means that the recommended tables realize higher
compression than the default tables.)

When the compression rate (bit/pel) is smaller than 1.0, the difference is
+1.2 ~ +3.6%.
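The percentage differences can be recomputed from the coded data amounts in Table 1. The sketch below assumes that the difference is taken relative to the default tables, which matches the stated sign convention; the function names are introduced here for illustration.

```python
def percent_difference(recommended_bits, default_bits):
    """Negative when the recommended tables produce fewer bits than the default."""
    return 100.0 * (recommended_bits - default_bits) / default_bits

def bit_rate(coded_bits, width, height):
    """Coded bits per pel."""
    return coded_bits / (width * height)

# Cafeteria, 1:1:1, scaling factor 8 (bit rate above 1.0).
print(round(percent_difference(23_519_154, 23_691_375), 2))  # -0.73
# pm003, 1:1:1, scaling factor 71 (bit rate below 1.0).
print(round(percent_difference(2_556_416, 2_475_670), 2))    # 3.26
# Cafeteria is 2048*2560 pels.
print(round(bit_rate(23_691_375, 2048, 2560), 6))            # 4.518771
```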
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Table.  I  Comparison  of  compression  rate 

Image:  pm003  Top  row  :  Optimized  Huffman  tables 

Middle  row  :  Default  Huffman  tables 

Bottom  row  :  Recommended  Huffman  tables 

t.?i _ 


9 

H 

7  1 

Coded  Data  Bit  Rate 
Amount(bits) (bits/pel) 

Coded  Data  Bit  Rate 
Amount (bits) (bits/pel) 

Coded  Data  Bit  Rate 
Aaount(bits)(blts/pel) 

1:1:1 

8,490,219  2.024226 
8,596,292  2.049516 
8,656,885  2.063962 

4,377,975  1.043791 
4,439,235  1.058396 
4,528,658  1.079716 

2,246,747  0.535666 
2,475,670  0.590246 
2,556,416  0.609497 

4:2:2 

6,510,892  1.552318 
6,584,480  1.569862 
6,647,437  1.584873 

3,484,759  0.830831 
3,518,823  0.838953 
3.593,834  0.856837  ' 

1,820.211  0.433972 
1,957,006  0.468971 
2,036,570  0.485556 

I  m  e 

g  e  :  s  a  O  0  1 . 

9 

24 

7  1 

Coded  Data  Bit  Rate 
Amount(bits) (bits/pel) 

Coded  Data  Bit  Rate 
Amount (bits) (bits/pel) 

Coded  Data  Bit  Rate 

Amount (bits) (bits/pel) 

1:1:1 

10,831.817  2.582506 
10,977,181  2.617164 
11,016,144  2.626453 

5,679,529  1.354105 

'  5.765,274  1.374548 

5,735,731  1.381810 

2,502.094  0.598546 
2,781,095  0.663065 
2.819,801  0.672293 

4:2:2 

8,748,416  2.085785 
8,863,644  2.113257 
8,848,053  2.109540 

.4.760,869  1.135080 
’4.832,815  1.152233 
.4.835,960  1.152983 

2.192,600  0.522757 
2.377.852  0.565924 
2,405.835  0.573596 

Image:Cafeteria 

9 

24 

7  1 

Coded  Data  Bit  Rate 
Aaount(bits)(blts/pel) 

Coded  Data  Bit  Rate 
Amount(blts)(bits/pel) 

Coded  Data  Bit  Rate 
Amount (bits) (bits/pel) 

1:1:1 

22,985,467  4.384130 
23,691,375  4.518771 
23,519,154  4.485923 

12,946,460  2.469341' 
13,035,925  2.486405 
13,039,429  2.487074 

6,589,172  1.256785 
6,697,457  1.277439 
6,738,258  1.285221 

4:2:2 

18,831,903  3.591900 
19,410,366  3.702234 
19,253.435  3.672301 

10.971,185  2.092588 

11.063,594  2.110213 
11,034,374  2.104640 

5.719,126  1.090837 
5,790,657  1.104480 
5,809,481  1.108071 
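The percentage differences quoted in section 2.2 can be recomputed from the bit rates in Table 1. The sketch below assumes the difference is taken relative to the default tables, which matches the sign convention stated in the text (negative when the recommended tables compress better); the function name is an illustrative choice.

```python
def relative_difference_percent(default_rate, recommended_rate):
    """Difference of the recommended-table bit rate relative to the
    default-table bit rate, in percent. A negative value means the
    recommended tables produced fewer bits per pel."""
    return (recommended_rate - default_rate) / default_rate * 100.0


# Cafeteria, 1:1:1, first scaling-factor column (bit rate above 1.0).
high_rate = relative_difference_percent(4.518771, 4.485923)

# pm003, 1:1:1, scaling factor 71 (bit rate below 1.0).
low_rate = relative_difference_percent(0.590246, 0.609497)

print(round(high_rate, 2), round(low_rate, 2))
```

Both sample values fall inside the ranges given in section 2.2.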
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3.  Conclusion 

This contribution presents the experimental results for deciding
default Huffman tables.

Default Huffman tables are derived from statistical data from
3 test images, and the compression rates obtained with these default tables are
compared with the optimized Huffman tables for each image (custom tables)
and the recommended Huffman tables from JPEG.

As a result, the difference in compression rate between the default
tables and the recommended tables is very small.




FROM  NEC 




ANNEX A

Procedure for deriving default Huffman tables

(1) For each image, derive PDJ(n) and PAJ(r,s) for each combination (J)
of subsampling rate and scaling factor.

PDJ(n): Probability of DC difference magnitude categories
(see page 102 in DIS 10918-1)
n = 0 ~ 11

\[ \sum_{n} PD_J(n) = 1.0 \]

PAJ(r,s): Two-dimensional probability of AC run/size combination
(see page 104 in DIS 10918-1)
r = 0 ~ 15 (Run: run length of zero coefficients)
s = 0 ~ 10 (Size: category of non-zero coefficient)
(0,0) -> EOB
(15,0) -> ZRL
(r,0), r = 1 ~ 14: undefined

\[ \sum_{r,s} PA_J(r,s) = 1.0 \]

(2) For each image, derive the weighted probabilities PD(n) and PA(r,s).

\[ PD(n) = \sum_{J} I_J \cdot PD_J(n) \]

\[ PA(r,s) = \sum_{J} I_J \cdot PA_J(r,s) \]

(3) Derive the averaged probabilities PPD(n) and PPA(r,s) among plural images.

\[ PPD(n) = \sum_{\text{images}} PD(n) \,/\, (\text{number of images}) \]

\[ PPA(r,s) = \sum_{\text{images}} PA(r,s) \,/\, (\text{number of images}) \]

(4) Deriving BITS, HUFFVAL

Derive BITS and HUFFVAL from PPD(n) and PPA(r,s), according to
Chapter K.2, page 181-188, in DIS 10918-1.

BITS: list of code lengths
HUFFVAL: list of values

(5) Deriving EHUFCO, EHUFSI

Derive EHUFCO and EHUFSI from BITS and HUFFVAL, according to
Chapter C, page 63-66, in DIS 10918-1.

EHUFCO: code table
EHUFSI: code size table

ANNEX B


Huffman tables

*** L* DC Huffman table ***

** BITS in hex. **
00 01 05 01 01 01 01 01 01 00 00 00 00 00 00 00
** HUFFVAL in hex. **
00 01 02 03 04 05 06 07 08 09 0A 0B

Category   Code length   Code word
0          2             00
1          3             010
2          3             011
3          3             100
4          3             101
5          3             110
6          4             1110
7          5             11110
8          6             111110
9          7             1111110
10         8             11111110
11         9             111111110
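The code words in this table follow from BITS and HUFFVAL by the usual canonical code assignment (step 5 of ANNEX A refers to Chapter C of DIS 10918-1 for the normative procedure). The following is only a sketch of that assignment as commonly implemented, using the L* DC values above; the function name and the dictionary return format are assumptions made for illustration.

```python
def generate_codes(bits, huffval):
    """Assign canonical code words from a BITS/HUFFVAL pair.

    bits[i] is the number of codes of length i + 1 (16 entries);
    huffval lists the symbols in order of increasing code length.
    Returns {symbol: (code_length, code_word)}, which plays the role
    of EHUFSI and EHUFCO indexed by symbol.
    """
    if sum(bits) != len(huffval):
        raise ValueError("BITS does not match the number of HUFFVAL entries")
    table = {}
    code = 0
    index = 0
    for length, count in enumerate(bits, start=1):
        for _ in range(count):
            table[huffval[index]] = (length, format(code, "0{}b".format(length)))
            code += 1
            index += 1
        code <<= 1  # moving to the next code length appends a zero bit
    return table


# BITS and HUFFVAL of the L* DC Huffman table.
L_DC_BITS = [0x00, 0x01, 0x05, 0x01, 0x01, 0x01, 0x01, 0x01,
             0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]
L_DC_HUFFVAL = list(range(0x0C))

l_dc_table = generate_codes(L_DC_BITS, L_DC_HUFFVAL)
for category in L_DC_HUFFVAL:
    length, word = l_dc_table[category]
    print(category, length, word)
```

Running the same function on the other BITS/HUFFVAL pairs of this annex is a quick way to cross-check the printed code words.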


*** a*, b* DC Huffman table ***

** BITS in hex. **
00 03 01 01 01 01 01 01 01 01 01 00 00 00 00 00
** HUFFVAL in hex. **
00 01 02 03 04 05 06 07 08 09 0A 0B

Category   Code length   Code word
0          2             00
1          2             01
2          2             10
3          3             110
4          4             1110
5          5             11110
6          6             111110
7          7             1111110
8          8             11111110
9          9             111111110
10         10            1111111110
11         11            11111111110
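To illustrate how such a DC table is used, the sketch below encodes a DC difference as the Huffman code of its magnitude category followed by the additional bits, following the general JPEG convention (negative values are sent as the low-order bits of the value minus one). It uses the a*, b* DC code words listed above; the function name is an illustrative assumption.

```python
# Code words of the a*, b* DC Huffman table, indexed by category 0..11.
AB_DC_CODES = ["00", "01", "10", "110", "1110", "11110", "111110",
               "1111110", "11111110", "111111110", "1111111110",
               "11111111110"]


def encode_dc_difference(diff):
    """Return the bit string for one DC difference value."""
    category = abs(diff).bit_length()  # magnitude category (SSSS)
    if category >= len(AB_DC_CODES):
        raise ValueError("difference out of range for this table")
    code = AB_DC_CODES[category]
    if category == 0:
        return code  # a zero difference carries no additional bits
    # Positive values are sent as-is; negative values as diff + 2^category - 1.
    value = diff if diff > 0 else diff + (1 << category) - 1
    return code + format(value, "0{}b".format(category))


print(encode_dc_difference(0))
print(encode_dc_difference(5))
print(encode_dc_difference(-5))
```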


*** L* AC Huffman table ***

** BITS in hex. **
00 01 03 02 04 03 04 05 07 08 06 04 01 01 02 6F

** HUFFVAL in hex. **
01 00 02 03 04 11 05 12 21 31 06 41 51 13 22 61
71 07 14 32 81 91 15 23 42 52 A1 B1 C1 08 24 33
62 72 D1 E1 F0 16 34 43 53 82 92 25 35 63 73 A2
B2 F1 26 44 54 83 93 C2 17 36 B3 09 45 64 74 94
A3 D2 84 18 27 55 E2 F2 37 46 65 A4 B4 C3 28 85
C4 F3 56 75 86 B5 D3 47 66 76 95 19 38 96 0A 1A
29 2A 39 3A 48 49 4A 57 58 59 5A 67 68 69 6A 77
78 79 7A 87 88 89 8A 97 98 99 9A A5 A6 A7 A8 A9
AA B6 B7 B8 B9 BA C5 C6 C7 C8 C9 CA D4 D5 D6 D7
D8 D9 DA E3 E4 E5 E6 E7 E8 E9 EA F4 F5 F6 F7 F8
F9 FA

Run/Size   Code length   Code word
0/0        3             010
0/1        2             00
0/2        3             011
0/3        3             100
0/4        4             1010
0/5        5             11000
0/6        6             111000
0/7        8             11110100
0/8        10            1111110010
0/9        16            1111111110011000
0/A        16            1111111110111011
1/1        4             1011
1/2        5             11001
1/3        7             1110110
1/4        8             11110101
1/5        9             111110010
1/6        11            11111110100
1/7        16            1111111110010101
1/8        16            1111111110100000
1/9        16            1111111110111000
1/A        16            1111111110111100
2/1        5             11010
2/2        7             1110111
2/3        9             111110011
2/4        10            1111110011
2/5        12            111111110100
2/6        15            111111111000111
2/7        16            1111111110100001
2/8        16            1111111110101011
2/9        16            1111111110111101
2/A        16            1111111110111110
3/1        5             11011
3/2        8             11110110
3/3        10            1111110100
3/4        11            11111110101
3/5        12            111111110101
3/6        16            1111111110010110
3/7        16            1111111110100101
3/8        16            1111111110111001
3/9        16            1111111110111111
3/A        16            1111111111000000
4/1        6             111001
4/2        9             111110100
4/3        11            11111110110
4/4        16            1111111110010000
4/5        16            1111111110011001
4/6        16            1111111110100110
4/7        16            1111111110110100
4/8        16            1111111111000001
4/9        16            1111111111000010
4/A        16            1111111111000011
5/1        6             111010
5/2        9             111110101
5/3        11            11111110111
5/4        16            1111111110010001
5/5        16            1111111110100010
5/6        16            1111111110101111
5/7        16            1111111111000100
5/8        16            1111111111000101
5/9        16            1111111111000110
5/A        16            1111111111000111
6/1        7             1111000
6/2        10            1111110101
6/3        12            111111110110
6/4        16            1111111110011010
6/5        16            1111111110100111
6/6        16            1111111110110101
6/7        16            1111111111001000
6/8        16            1111111111001001
6/9        16            1111111111001010
6/A        16            1111111111001011
7/1        7             1111001
7/2        10            1111110110
7/3        12            111111110111
7/4        16            1111111110011011
7/5        16            1111111110110000
7/6        16            1111111110110110
7/7        16            1111111111001100
7/8        16            1111111111001101
7/9        16            1111111111001110
7/A        16            1111111111001111
8/1        8             11110111
8/2        11            11111111000
8/3        16            1111111110010010
8/4        16            1111111110011111
8/5        16            1111111110101100
8/6        16            1111111110110001
8/7        16            1111111111010000
8/8        16            1111111111010001
8/9        16            1111111111010010
8/A        16            1111111111010011
9/1        8             11111000
9/2        11            11111111001
9/3        16            1111111110010011
9/4        16            1111111110011100
9/5        16            1111111110110111
9/6        16            1111111110111010
9/7        16            1111111111010100
9/8        16            1111111111010101
9/9        16            1111111111010110
9/A        16            1111111111010111
A/1        9             111110110
A/2        13            1111111110000
A/3        16            1111111110011101
A/4        16            1111111110101000
A/5        16            1111111111011000
A/6        16            1111111111011001
A/7        16            1111111111011010
A/8        16            1111111111011011
A/9        16            1111111111011100
A/A        16            1111111111011101
B/1        9             111110111
B/2        14            11111111100010
B/3        16            1111111110010111
B/4        16            1111111110101001
B/5        16            1111111110110010
B/6        16            1111111111011110
B/7        16            1111111111011111
B/8        16            1111111111100000
B/9        16            1111111111100001
B/A        16            1111111111100010
C/1        9             111111000
C/2        16            1111111110010100
C/3        16            1111111110101010
C/4        16            1111111110101101
C/5        16            1111111111100011
C/6        16            1111111111100100
C/7        16            1111111111100101
C/8        16            1111111111100110
C/9        16            1111111111100111
C/A        16            1111111111101000
D/1        10            1111110111
D/2        16            1111111110011110
D/3        16            1111111110110011
D/4        16            1111111111101001
D/5        16            1111111111101010
D/6        16            1111111111101011
D/7        16            1111111111101100
D/8        16            1111111111101101
D/9        16            1111111111101110
D/A        16            1111111111101111
E/1        10            1111111000
E/2        16            1111111110100011
E/3        16            1111111111110000
E/4        16            1111111111110001
E/5        16            1111111111110010
E/6        16            1111111111110011
E/7        16            1111111111110100
E/8        16            1111111111110101
E/9        16            1111111111110110
E/A        16            1111111111110111
F/0        10            1111111001
F/1        15            111111111000110
F/2        16            1111111110100100
F/3        16            1111111110101110
F/4        16            1111111111111000
F/5        16            1111111111111001
F/6        16            1111111111111010
F/7        16            1111111111111011
F/8        16            1111111111111100
F/9        16            1111111111111101
F/A        16            1111111111111110


*** a*, b* AC Huffman table ***

** BITS in hex. **
00 02 02 01 03 02 03 05 05 07 01 01 01 02 00 7F

** HUFFVAL in hex. **
00 01 02 11 03 12 21 31 04 41 22 51 61 05 13 32
71 F0 81 91 A1 B1 C1 06 14 23 33 42 D1 E1 52 15
43 62 72 F1 07 24 34 82 92 B2 53 A2 C2 16 25 83
A3 D2 08 35 44 63 73 93 B3 E2 17 54 64 84 94 A4
B4 C3 26 45 36 F2 55 74 D3 C4 09 0A 18 19 1A 27
28 29 2A 37 38 39 3A 46 47 48 49 4A 56 57 58 59
5A 65 66 67 68 69 6A 75 76 77 78 79 7A 85 86 87
88 89 8A 95 96 97 98 99 9A A5 A6 A7 A8 A9 AA B5
B6 B7 B8 B9 BA C5 C6 C7 C8 C9 CA D4 D5 D6 D7 D8
D9 DA E3 E4 E5 E6 E7 E8 E9 EA F3 F4 F5 F6 F7 F8
F9 FA

Run/Size   Code length   Code word
0/0        2             00
0/1        2             01
0/2        3             100
0/3        4             1100
0/4        6             111010
0/5        8             11110110
0/6        10            1111110110
0/7        16            1111111110000001
0/8        16            1111111110001111
0/9        16            1111111110100111
0/A        16            1111111110101000
1/1        3             101
1/2        5             11010
1/3        8             11110111
1/4        10            1111110111
1/5        12            111111110110
1/6        16            1111111110001010
1/7        16            1111111110010111
1/8        16            1111111110101001
1/9        16            1111111110101010
1/A        16            1111111110101011
2/1        5             11011
2/2        7             1111000
2/3        10            1111111000
2/4        16            1111111110000010
2/5        16            1111111110001011
2/6        16            1111111110011111
2/7        16            1111111110101100
2/8        16            1111111110101101
2/9        16            1111111110101110
2/A        16            1111111110101111
3/1        5             11100
3/2        8             11111000
3/3        10            1111111001
3/4        16            1111111110000011
3/5        16            1111111110010000
3/6        16            1111111110100001
3/7        16            1111111110110000
3/8        16            1111111110110001
3/9        16            1111111110110010
3/A        16            1111111110110011
4/1        6             111011
4/2        10            1111111010
4/3        13            1111111101110
4/4        16            1111111110010001
4/5        16            1111111110100000
4/6        16            1111111110110100
4/7        16            1111111110110101
4/8        16            1111111110110110
4/9        16            1111111110110111
4/A        16            1111111110111000
5/1        7             1111001
5/2        11            11111111010
5/3        16            1111111110000111
5/4        16            1111111110011000
5/5        16            1111111110100011
5/6        16            1111111110111001
5/7        16            1111111110111010
5/8        16            1111111110111011
5/9        16            1111111110111100
5/A        16            1111111110111101
6/1        7             1111010
6/2        14            11111111011110
6/3        16            1111111110010010
6/4        16            1111111110011001
6/5        16            1111111110111110
6/6        16            1111111110111111
6/7        16            1111111111000000
6/8        16            1111111111000001
6/9        16            1111111111000010
6/A        16            1111111111000011
7/1        8             11111001
7/2        14            11111111011111
7/3        16            1111111110010011
7/4        16            1111111110100100
7/5        16            1111111111000100
7/6        16            1111111111000101
7/7        16            1111111111000110
7/8        16            1111111111000111
7/9        16            1111111111001000
7/A        16            1111111111001001
8/1        9             111110110
8/2        16            1111111110000100
8/3        16            1111111110001100
8/4        16            1111111110011010
8/5        16            1111111111001010
8/6        16            1111111111001011
8/7        16            1111111111001100
8/8        16            1111111111001101
8/9        16            1111111111001110
8/A        16            1111111111001111
9/1        9             111110111
9/2        16            1111111110000101
9/3        16            1111111110010100
9/4        16            1111111110011011
9/5        16            1111111111010000
9/6        16            1111111111010001
9/7        16            1111111111010010
9/8        16            1111111111010011
9/9        16            1111111111010100
9/A        16            1111111111010101
A/1        9             111111000
A/2        16            1111111110001000
A/3        16            1111111110001101
A/4        16            1111111110011100
A/5        16            1111111111010110
A/6        16            1111111111010111
A/7        16            1111111111011000
A/8        16            1111111111011001
A/9        16            1111111111011010
A/A        16            1111111111011011
B/1        9             111111001
B/2        16            1111111110000110
B/3        16            1111111110010101
B/4        16            1111111110011101
B/5        16            1111111111011100
B/6        16            1111111111011101
B/7        16            1111111111011110
B/8        16            1111111111011111
B/9        16            1111111111100000
B/A        16            1111111111100001
C/1        9             111111010
C/2        16            1111111110001001
C/3        16            1111111110011110
C/4        16            1111111110100110
C/5        16            1111111111100010
C/6        16            1111111111100011
C/7        16            1111111111100100
C/8        16            1111111111100101
C/9        16            1111111111100110
C/A        16            1111111111100111
D/1        10            1111111011
D/2        16            1111111110001110
D/3        16            1111111110100101
D/4        16            1111111111101000
D/5        16            1111111111101001
D/6        16            1111111111101010
D/7        16            1111111111101011
D/8        16            1111111111101100
D/9        16            1111111111101101
D/A        16            1111111111101110
E/1        10            1111111100
E/2        16            1111111110010110
E/3        16            1111111111101111
E/4        16            1111111111110000
E/5        16            1111111111110001
E/6        16            1111111111110010
E/7        16            1111111111110011
E/8        16            1111111111110100
E/9        16            1111111111110101
E/A        16            1111111111110110
F/0        8             11111010
F/1        16            1111111110000000
F/2        16            1111111110100010
F/3        16            1111111111110111
F/4        16            1111111111111000
F/5        16            1111111111111001
F/6        16            1111111111111010
F/7        16            1111111111111011
F/8        16            1111111111111100
F/9        16            1111111111111101
F/A        16            1111111111111110
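A BITS list can be sanity-checked without expanding the full code table. The sketch below, offered only as an illustrative consistency check and not as part of the contribution, uses the a*, b* AC BITS values to count the symbols, evaluate the Kraft sum, and locate the first 16-bit code word; the helper names are assumptions.

```python
from fractions import Fraction

# BITS of the a*, b* AC Huffman table (number of codes of length 1..16).
AB_AC_BITS = [0x00, 0x02, 0x02, 0x01, 0x03, 0x02, 0x03, 0x05,
              0x05, 0x07, 0x01, 0x01, 0x01, 0x02, 0x00, 0x7F]


def symbol_count(bits):
    """Total number of code words described by BITS."""
    return sum(bits)


def kraft_sum(bits):
    """Sum of 2^-length over all code words; at most 1 for a prefix code."""
    return sum(Fraction(count, 1 << length)
               for length, count in enumerate(bits, start=1))


def first_code_of_length(bits, target):
    """First canonical code word of the given length, as a bit string."""
    code = 0
    for length, count in enumerate(bits, start=1):
        if length == target:
            return format(code, "0{}b".format(length))
        code = (code + count) << 1
    raise ValueError("target length outside 1..16")


print(symbol_count(AB_AC_BITS))
print(kraft_sum(AB_AC_BITS))
print(first_code_of_length(AB_AC_BITS, 16))
```

The Kraft sum falls short of 1 by exactly one 16-bit code, which corresponds to the all-ones code word being left unused.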
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